Windy Gridworld with Sarsa

Foundations & Tabular RL DS practice problem on Onlearn.

Difficulty: medium.

Topics: Understanding Temporal Difference Learning in Stochastic Environments, Epsilon-Greedy Action Selection, Q-Table Initialization, Windy Environment Dynamics, Episode Termination Conditions, TD Error Calculation, Reinforcement Learning, Markov Decision Processes, Stochastic Processes, Dynamic Programming, Probability Theory, On-policy Control, Temporal Difference Learning, Exploration vs Exploitation, Gridworld Environments, Value Function Approximation.

Implement a Sarsa agent to navigate a 10x7 gridworld with a wind effect. The wind pushes the agent upward by a varying number of cells (0, 1, or 2) depending on the column. The agent must reach the goal state (7, 3) from start (0, 3) with a discount factor of 1.0, step size of 0.5, and epsilon of 0.1.
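The setup above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference solution. The problem statement does not give the per-column wind strengths, the action set, or the reward, so the sketch borrows the classic textbook layout (wind of 0, 0, 0, 1, 1, 1, 2, 2, 1, 0 across the ten columns), four moves, and a reward of -1 per step. It also assumes states are (column, row) with row 0 at the top, that the wind applied is that of the column being left, and that the Q-table starts at zero. The names `step`, `epsilon_greedy`, and `train_sarsa` are hypothetical helpers. Each update follows the Sarsa rule Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)), with the bootstrap term dropped when s' is the goal.

```python
import random

# Assumed layout: 10 columns x 7 rows, states are (column, row), row 0 at the top.
WIDTH, HEIGHT = 10, 7
START, GOAL = (0, 3), (7, 3)
# Assumption: per-column wind strengths from the classic textbook example.
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
# Assumption: four moves as (dx, dy); "up" decreases the row index.
ACTIONS = [(0, -1), (0, 1), (-1, 0), (1, 0)]
GAMMA, ALPHA, EPSILON = 1.0, 0.5, 0.1


def step(state, action):
    """Apply the move plus the wind of the column being left, clipped to the grid."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx = min(max(x + dx, 0), WIDTH - 1)
    ny = min(max(y + dy - WIND[x], 0), HEIGHT - 1)
    next_state = (nx, ny)
    # Assumed reward of -1 per step; the episode ends on reaching the goal.
    return next_state, -1.0, next_state == GOAL


def epsilon_greedy(q, state, rng):
    """Random action with probability EPSILON, else a greedy one (ties broken randomly)."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def train_sarsa(episodes=500, seed=0):
    """Run tabular Sarsa and return the Q-table and the length of each episode."""
    rng = random.Random(seed)
    # Assumption: Q-table initialized to zero for every state-action pair.
    q = {(x, y): [0.0] * len(ACTIONS) for x in range(WIDTH) for y in range(HEIGHT)}
    lengths = []
    for _ in range(episodes):
        state = START
        action = epsilon_greedy(q, state, rng)
        steps, done = 0, False
        while not done:
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(q, next_state, rng)
            # On-policy TD target: bootstrap from the action actually chosen next.
            target = reward + (0.0 if done else GAMMA * q[next_state][next_action])
            q[state][action] += ALPHA * (target - q[state][action])
            state, action = next_state, next_action
            steps += 1
        lengths.append(steps)
    return q, lengths


if __name__ == "__main__":
    _, lengths = train_sarsa()
    print("mean steps over last 50 episodes:", sum(lengths[-50:]) / 50)
```

Because the agent keeps exploring with epsilon of 0.1, episode lengths under this sketch should fall over training but will not settle exactly at the shortest path length.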
