Consider a Markov decision problem in which the stationary policies $k$ and $k^*$ each satisfy (4.50) and each correspond to ergodic Markov chains.
(a) Show that if $r^{k^*} + [P^{k^*}]w^* \ge r^k + [P^k]w^*$ is not satisfied with equality, then $g^* > g$.
(b) Show that $r^{k^*} + [P^{k^*}]w^* = r^k + [P^k]w^*$. Hint: Use (a).
(c) Find the relationship between the relative-gain vector $w^k$ for policy $k$ and the relative-gain vector $w^*$ for policy $k^*$. Hint: Show that $r^k + [P^k]w^* = ge + w^*$; what does this say about $w^k$ and $w^*$?
(d) Suppose that policy $k$ uses decision 1 in state 1 and policy $k^*$ uses decision 2 in state 1 (i.e., $k_1 = 1$ for policy $k$ and $k_1 = 2$ for policy $k^*$). What is the relationship between $r_1^{(k)}, P_{11}^{(k)}, P_{12}^{(k)}, \ldots, P_{1M}^{(k)}$ for $k$ equal to 1 and 2?
(e) Now suppose that policy $k$ uses decision 1 in each state and policy $k^*$ uses decision 2 in each state. Is it possible that $r_i^{(1)} > r_i^{(2)}$ for all $i$? Explain carefully.
(f) Now assume that $r_i^{(1)}$ is the same for all $i$. Does this change your answer to (e)? Explain.
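For anyone who wants to check parts (a) through (c) numerically on a small example, here is a minimal Python sketch. It is not from the textbook. It assumes the relative-gain equation has the form w + g e = r + [P]w, and it pins down w with the normalization w[0] = 0, which is a convenience chosen here and may differ from the book's normalization. The function names and the two-state example are hypothetical.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(a)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for i in range(n):
            if i != col and m[i][col] != 0:
                f = m[i][col]
                m[i] = [vi - f * vc for vi, vc in zip(m[i], m[col])]
    return [row[n] for row in m]


def gain_and_relative_gain(r, p):
    """Return (g, w) solving w + g*e = r + P w with w[0] = 0 (assumed normalization)."""
    n = len(r)
    rows, rhs = [], []
    # Unknown vector is (w[0], ..., w[n-1], g).
    for i in range(n):
        row = [(1 if i == j else 0) - Fraction(p[i][j]) for j in range(n)]
        rows.append(row + [1])  # coefficient of g is 1 in every state equation
        rhs.append(r[i])
    rows.append([1] + [0] * n)  # normalization row: w[0] = 0
    rhs.append(0)
    x = solve_linear(rows, rhs)
    return x[n], x[:n]


def one_step_value(r, p, w):
    """Return the vector r + P w, the quantity compared in parts (a) and (b)."""
    return [Fraction(r[i]) + sum(Fraction(p[i][j]) * w[j] for j in range(len(w)))
            for i in range(len(r))]


# Hypothetical two-state ergodic chain with made-up rewards.
p_example = [[Fraction(1, 2), Fraction(1, 2)],
             [Fraction(1, 4), Fraction(3, 4)]]
r_example = [1, 0]
g_example, w_example = gain_and_relative_gain(r_example, p_example)
```

To compare two policies, compute (g, w) for one of them and then evaluate one_step_value with each policy's r and P against that same w; the componentwise comparison is the inequality in part (a).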
Textbook: Stochastic Processes: Theory for Applications, by Robert G. Gallager.