The Discounted Return for a Given Trajectory

Foundations & Tabular RL DS practice problem on Onlearn.

Difficulty: easy.

Topics: Understanding the Cumulative Discounted Return in Reinforcement Learning, Discount Factor, Episodic Tasks, Cumulative Summation, Geometric Series, Trajectory Sampling, Reinforcement Learning, Probability Theory, Dynamic Programming, Stochastic Processes, Mathematical Optimization, Markov Decision Processes, Reward Functions, Return Definition, Time Horizons, Value Function Estimation.

In reinforcement learning, the return $G_t$ is defined as the sum of discounted future rewards: $G_t = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$. Given a list of rewards $[r_0, r_1, \dots, r_n]$ obtained from a trajectory and a discount factor $\gamma$ with $0 \le \gamma \le 1$, write a function that calculates the total discounted return starting from time step 0. Because the trajectory is finite, the sum stops at the last reward in the list: $G_0 = \sum_{k=0}^{n} \gamma^k r_k$.
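One possible solution is sketched below. It is not an official reference answer. The function name `discounted_return` and the choice to raise `ValueError` for a discount factor outside [0, 1] are assumptions made for illustration. The sketch accumulates the return from the last reward backward, using the recursion G = r + gamma * G, which avoids computing powers of gamma explicitly.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return sum(gamma**k * rewards[k]) over the whole trajectory.

    Hypothetical helper: the name and the input validation are
    illustrative assumptions, not part of the problem statement.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must satisfy 0 <= gamma <= 1")

    g = 0.0
    # Walk the trajectory backward: G_k = r_k + gamma * G_{k+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Worked example: 1 + 0.5 * 2 + 0.25 * 3 = 2.75
print(discounted_return([1.0, 2.0, 3.0], 0.5))
```

With gamma = 1 the result is the plain sum of the rewards, with gamma = 0 it is just the first reward, and an empty list yields 0.0.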
