1. A machine in excellent condition earns $100 profit per week, a machine in good condition earns $70 per week, and a machine in poor condition earns $20 per week. At the beginning of any week, a machine can be sent out for repairs at a cost of $90. A machine sent out for repairs returns in excellent condition at the beginning of the next week. If a machine is not repaired, its condition at the beginning of the next week follows the probability distribution given in the "Don't Repair" rows of the table below.
The company wishes to maximize its expected discounted profit over an infinite horizon with discount factor α = 0.9. Use an initial policy,
See work on the following page. My answer is
State                   Decision          Transition Probabilities     Return
                                          Excellent   Good   Poor
1. Excellent Condition  1. Repair         1           0      0          10
1. Excellent Condition  2. Don't Repair   0.7         0.2    0.1       100
2. Good Condition       1. Repair         1           0      0         -20
2. Good Condition       2. Don't Repair   0           0.7    0.3        70
3. Poor Condition       1. Repair         1           0      0         -70
3. Poor Condition       2. Don't Repair   0           0.1    0.9        20
For α=0.9 and a policy :
The policy did not change, so the iteration terminates.
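The stopping rule above (evaluate the current policy, improve it, and stop once the policy no longer changes) can be sketched in Python as follows. This is only an illustrative sketch, not the written-out work referred to in the problem: the function names (`solve`, `evaluate`, `q_value`, `policy_iteration`) are hypothetical, the data are copied from the table, and because the initial policy is not shown here, the starting policy "never repair" is an assumption.

```python
# Policy iteration sketch for the machine-repair problem.
# States:  0 = excellent, 1 = good, 2 = poor.
# Actions: 0 = repair,    1 = don't repair.

ALPHA = 0.9  # discount factor

# P[state][action] = probabilities of moving to (excellent, good, poor)
P = [
    [[1.0, 0.0, 0.0], [0.7, 0.2, 0.1]],
    [[1.0, 0.0, 0.0], [0.0, 0.7, 0.3]],
    [[1.0, 0.0, 0.0], [0.0, 0.1, 0.9]],
]

# R[state][action] = one-week return from the table; each repair entry
# equals that state's weekly profit minus the $90 repair cost.
R = [[10.0, 100.0], [-20.0, 70.0], [-70.0, 20.0]]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate(policy):
    """Policy evaluation: solve (I - alpha * P_pi) V = R_pi for V."""
    n = len(policy)
    A = [
        [(1.0 if i == j else 0.0) - ALPHA * P[i][policy[i]][j] for j in range(n)]
        for i in range(n)
    ]
    b = [R[i][policy[i]] for i in range(n)]
    return solve(A, b)


def q_value(s, a, V):
    """Return of taking action a in state s, then following values V."""
    return R[s][a] + ALPHA * sum(p * v for p, v in zip(P[s][a], V))


def policy_iteration(policy):
    """Alternate evaluation and improvement until the policy stops changing."""
    while True:
        V = evaluate(policy)
        new_policy = []
        for s, current in enumerate(policy):
            best = current
            for a in (0, 1):
                # Switch only on a strict improvement, so ties cannot cause cycling.
                if q_value(s, a, V) > q_value(s, best, V) + 1e-9:
                    best = a
            new_policy.append(best)
        if new_policy == policy:  # policy did not change: terminate
            return policy, V
        policy = new_policy


if __name__ == "__main__":
    # Assumed starting policy: never repair in any state.
    final_policy, values = policy_iteration([1, 1, 1])
    print(final_policy, [round(v, 2) for v in values])
```

Run on the data as tabulated, the sketch stops at the policy "don't repair in excellent condition, repair in good or poor condition", with discounted values of roughly 709.45, 618.50, and 568.50 for the excellent, good, and poor states. The linear system is solved exactly rather than by repeated substitution because there are only three states; for a larger problem an iterative evaluation step would be a reasonable alternative.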