Discounted Return

Foundations & Tabular RL DS practice problem on Onlearn.

Difficulty: easy.

Topics: Understanding the Discounted Return in Reinforcement Learning, Discount Factor Gamma, Horizon Length, Geometric Series Convergence, Time Step Indexing, Reward Signal Processing, Reinforcement Learning, Dynamic Programming, Mathematical Optimization, Probability Theory, Sequence Analysis, Markov Decision Processes, Cumulative Reward, Temporal Difference Learning, Infinite Horizons, Value Function Estimation.

Given a sequence of immediate rewards $[r_0, r_1, \ldots, r_n]$ and a discount factor $\gamma$ ($0 \le \gamma \le 1$), calculate the total discounted return $G_0$, defined as $G_0 = \sum_{t=0}^{n} \gamma^t r_t$. Write a function that takes a list of rewards and the discount factor as inputs and returns the calculated return.
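
One possible solution is sketched below; it is an illustration, not a reference answer. The function name `discounted_return` and the choice to raise `ValueError` for a gamma outside [0, 1] are assumptions, since the problem does not specify either. Rather than computing each power of gamma separately, the sketch walks the rewards backward and applies the recursion G_t = r_t + gamma * G_{t+1}, which yields the same sum as the formula above.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return G_0 = sum over t of gamma**t * rewards[t].

    The name and the error handling are illustrative assumptions,
    not part of the original problem statement.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")

    g = 0.0
    # Backward pass: G_t = r_t + gamma * G_{t+1}, starting from G_{n+1} = 0.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    # 1 + 0.5 * 2 + 0.25 * 3 = 2.75
    print(discounted_return([1.0, 2.0, 3.0], 0.5))
```

With rewards [1, 2, 3] and gamma = 0.5, the terms are 1, 0.5 * 2 = 1, and 0.25 * 3 = 0.75, so the result is 2.75. An empty reward list gives 0.0, gamma = 0 returns only the first reward, and gamma = 1 reduces to the plain sum of rewards.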