Temporal Difference Error

Foundations & Tabular RL DS practice problem on Onlearn.

Difficulty: medium.

Topics: Understanding Temporal Difference (TD) Learning, TD Error Formulation, Discount Factor Impact, State-Value Function Update, One-step lookahead, Prediction Error Analysis, Reinforcement Learning, Dynamic Programming, Stochastic Processes, Optimization, Probability Theory, Value Function Estimation, Bellman Equations, Markov Decision Processes, Bootstrapping, Temporal Difference Learning.

Implement a function that computes the Temporal Difference (TD) error for a single transition in a Reinforcement Learning environment. Given the reward 'r', the discount factor 'gamma', the current state value 'v_s', and the next state value 'v_next_s', return the TD error.
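
One possible solution is sketched below. It assumes the standard one-step definition of the TD error, delta = r + gamma * V(s') - V(s): the bootstrapped target (the reward plus the discounted value of the next state) minus the current estimate. The function name td_error and the float type hints are illustrative choices, not part of the problem statement, and the sketch does not validate that gamma lies in [0, 1].

```python
def td_error(r: float, gamma: float, v_s: float, v_next_s: float) -> float:
    """Return the one-step TD error for a single transition.

    delta = r + gamma * V(s') - V(s)

    r        -- reward observed on the transition
    gamma    -- discount factor (assumed to lie in [0, 1]; not checked here)
    v_s      -- current estimate of the value of the state being left
    v_next_s -- current estimate of the value of the state being entered
    """
    # One-step lookahead target: immediate reward plus discounted next-state value.
    td_target = r + gamma * v_next_s
    # The TD error is how far the current estimate falls short of that target.
    return td_target - v_s


if __name__ == "__main__":
    # Target is 1.0 + 0.5 * 4.0 = 3.0, so the error against v_s = 2.0 is 1.0.
    print(td_error(r=1.0, gamma=0.5, v_s=2.0, v_next_s=4.0))
```

A positive error means the transition turned out better than the current estimate predicted, and a negative error means it turned out worse. With gamma = 0 the next-state value drops out and the error reduces to r - v_s.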