Temporal Difference Error
Foundations & Tabular RL DS practice problem on Onlearn.
Difficulty: medium.
Topics: Understanding Temporal Difference (TD) Learning, TD Error Formulation, Discount Factor Impact, State-Value Function Update, One-step lookahead, Prediction Error Analysis, Reinforcement Learning, Dynamic Programming, Stochastic Processes, Optimization, Probability Theory, Value Function Estimation, Bellman Equations, Markov Decision Processes, Bootstrapping, Temporal Difference Learning.
Implement a function that computes the Temporal Difference (TD) error for a single transition in a Reinforcement Learning environment. Given the reward 'r', the discount factor 'gamma', the current state value 'v_s', and the next state value 'v_next_s', return the TD error: r + gamma * v_next_s - v_s.
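One possible solution is sketched below. It uses the standard one-step TD error, delta = r + gamma * V(s') - V(s), which measures how far the current estimate V(s) is from the bootstrapped target r + gamma * V(s'). The function name td_error and the parameter spellings v_s and v_next_s are assumptions for illustration; the problem statement does not fix a signature.

```python
def td_error(r: float, gamma: float, v_s: float, v_next_s: float) -> float:
    """Return the one-step TD error for a single transition.

    r        -- reward received on the transition
    gamma    -- discount factor, conventionally in [0, 1]
    v_s      -- current estimate of the value of the current state
    v_next_s -- current estimate of the value of the next state
    (Parameter names are assumed; the task only describes the inputs.)
    """
    # TD target: immediate reward plus the discounted next-state estimate.
    td_target = r + gamma * v_next_s
    # TD error: how much the target differs from the current estimate.
    return td_target - v_s


if __name__ == "__main__":
    # 1.0 + 0.5 * 4.0 - 2.0
    print(td_error(r=1.0, gamma=0.5, v_s=2.0, v_next_s=4.0))  # prints 1.0
```

A positive result means the transition turned out better than V(s) predicted, and a negative result means it turned out worse. With gamma = 0 the next-state value drops out and the error reduces to r - v_s. For a transition into a terminal state, the caller would conventionally pass v_next_s = 0.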