The Gambler's Problem
Using Value Iteration to find the optimal betting policy.
P(Heads): 0.40
[Plot: Value Estimates (Probability of Winning)]
[Plot: Final Policy (Stake)]
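
The value estimates and the stake policy named above could be computed along the following lines. This is a minimal sketch, not the page's actual implementation. It assumes the standard textbook formulation of the problem, which the page does not spell out: the gambler wins on reaching a capital of 100, loses at 0, may stake any whole amount up to what is needed to reach the goal, and there is no discounting. The names `value_iteration`, `goal`, and `theta`, and the choice to break ties in favor of the smallest stake, are illustrative assumptions.

```python
def value_iteration(p_heads=0.40, goal=100, theta=1e-9):
    """Sketch of value iteration for the Gambler's Problem.

    Assumed setup (not stated in the source): capital runs from 0 to `goal`,
    reaching `goal` is a win, reaching 0 is a loss, no discounting.
    """
    # V[s] estimates the probability of eventually winning from capital s.
    V = [0.0] * (goal + 1)
    V[goal] = 1.0  # terminal win; V[0] stays 0.0 (terminal loss)

    while True:
        delta = 0.0
        for s in range(1, goal):
            # A stake may not exceed the capital or overshoot the goal.
            stakes = range(1, min(s, goal - s) + 1)
            best = max(
                p_heads * V[s + a] + (1 - p_heads) * V[s - a] for a in stakes
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update
        if delta < theta:  # stop once a full sweep barely changes V
            break

    # Greedy policy with respect to the converged values.
    policy = [0] * (goal + 1)
    for s in range(1, goal):
        stakes = range(1, min(s, goal - s) + 1)
        returns = [p_heads * V[s + a] + (1 - p_heads) * V[s - a] for a in stakes]
        best = max(returns)
        # Assumed tie-break: the smallest stake within a small tolerance.
        policy[s] = next(a for a, q in zip(stakes, returns) if q >= best - 1e-9)

    return V, policy
```

Calling `value_iteration(0.40)` returns two lists indexed by capital: the first corresponds to the value-estimate plot, the second to the stake plot. Many stakes tie for the best return in this problem, so the shape of the policy plot depends heavily on the tie-breaking rule. A different rule yields a different but equally optimal policy.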