A person often finds that she is up to 1 hour late for work. If she is from 1 to 30 minutes late, $4 is deducted from her paycheck; if she is from 31 to 60 minutes late for work, $8 is deducted from her paycheck. If she drives to work at her normal speed (which is well under the speed limit), she can arrive in 20 minutes. However, if she exceeds the speed limit a little here and there on her way to work, she can get there in 10 minutes, but she runs the risk of getting a speeding ticket. With probability 1/8 she will get caught speeding and will be fined $20 and delayed 10 minutes, so that it takes 20 minutes to reach work.
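As a quick check on the figures above, the expected fine and the expected travel time on a day she speeds follow directly from the 1/8 chance of being caught. The symbols F and T are introduced here only for illustration and are not part of the problem statement.

```latex
\[
E[F] = \tfrac{1}{8}\cdot 20 + \tfrac{7}{8}\cdot 0 = 2.50 \text{ dollars},
\qquad
E[T] = \tfrac{7}{8}\cdot 10 + \tfrac{1}{8}\cdot 20 = 11.25 \text{ minutes}.
\]
```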
The transition probabilities for s tomorrow if she speeds to work today are given by
Note that there are no transition probabilities for (20, ∞) and (-10, 9), because she will get to work on time and from 1 to 30 minutes late, respectively, regardless of whether she speeds. Hence, speeding when in these states would not be a logical choice. Also note that the transition probabilities imply that the later she is for work and the more she has to rush to get there, the more likely she is to leave for work earlier the next day. She wishes to determine when she should speed and when she should take her time getting to work in order to minimize her (long-run) expected average cost per day.
(a) Formulate this problem as a Markov decision process by identifying the states and decisions and then finding the C_ik.
(b) Identify all the (stationary deterministic) policies. For each one, find the transition matrix and write an expression for the (long-run) expected average cost per period in terms of the unknown steady-state probabilities.
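For reference, the expression requested in part (b) usually takes the following general form. The notation is an assumption made here: states are labeled 0, ..., M, d_i(R) denotes the decision that policy R prescribes in state i, p_ij(k) is the transition probability from state i to state j under decision k, and π_i is the steady-state probability of state i under R.

```latex
\[
E(C) = \sum_{i=0}^{M} C_{i\,d_i(R)}\,\pi_i ,
\qquad\text{where}\qquad
\pi_j = \sum_{i=0}^{M} \pi_i\, p_{ij}\bigl(d_i(R)\bigr) \ \ (j = 0,\dots,M),
\qquad
\sum_{i=0}^{M} \pi_i = 1 .
\]
```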
(c) Use your OR Courseware to find these steady-state probabilities for each policy. Then evaluate the expression obtained in part (b) to find the optimal policy by exhaustive enumeration.
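If the OR Courseware is not at hand, the computation in part (c) can be sketched in a few lines of Python. This is a minimal sketch, not the courseware's method: the function names `steady_state` and `best_policy` are hypothetical, and the two-state data at the bottom is a toy placeholder, not this problem's transition probabilities or costs, which must be substituted from parts (a) and (b). The sketch also assumes each policy's chain has a unique steady-state distribution.

```python
from fractions import Fraction
from itertools import product


def steady_state(P):
    """Solve pi = pi P with sum(pi) = 1 by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(P)
    # First n-1 rows: balance equations sum_i pi_i * (P[i][j] - delta_ij) = 0.
    A = [[Fraction(P[i][j]) - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
         for j in range(n - 1)]
    # Last row: normalization sum_i pi_i = 1 (replaces the redundant balance equation).
    A.append([Fraction(1)] * n + [Fraction(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def best_policy(rows, costs):
    """Enumerate every stationary deterministic policy and return (policy, average cost).

    rows[i][k]  : transition row out of state i under decision k
    costs[i][k] : expected immediate cost C_ik
    """
    best = None
    for policy in product(*(sorted(r) for r in rows)):
        P = [rows[i][k] for i, k in enumerate(policy)]
        pi = steady_state(P)
        avg = sum(pi[i] * costs[i][k] for i, k in enumerate(policy))
        if best is None or avg < best[1]:
            best = (policy, avg)
    return best


# Toy placeholder data (NOT the data of this problem): two states,
# state 0 allows only decision 1, state 1 allows decisions 1 and 2.
half, quarter = Fraction(1, 2), Fraction(1, 4)
toy_rows = [
    {1: [half, half]},
    {1: [quarter, 3 * quarter], 2: [3 * quarter, quarter]},
]
toy_costs = [{1: 0}, {1: 4, 2: 3}]

if __name__ == "__main__":
    policy, avg = best_policy(toy_rows, toy_costs)
    print(policy, avg)  # for the toy data: (1, 2) 6/5
```

Exact fractions are used so that the steady-state probabilities can be compared directly with hand calculations; floating-point arithmetic would work equally well for a problem of this size.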