Epsilon-Greedy Action Selection for n-Armed Bandit

Foundations & Tabular RL DS practice problem on Onlearn.

Difficulty: easy.

Topics: Understanding Epsilon-Greedy Action Selection in Multi-Armed Bandits, Epsilon Parameter, Argmax Selection, Random Sampling, Action Space Indexing, Tie-breaking in Greedy Selection, Reinforcement Learning, Probability Theory, Decision Theory, Optimization, Statistics, Exploration-Exploitation Trade-off, Multi-Armed Bandit Problem, Action-Value Estimation, Greedy Policy, Stochastic Processes.

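Implement an epsilon-greedy action selection function for an n-armed bandit problem. The function should take the current estimated action values (Q-values), the exploration rate epsilon, and a random seed, and return the index of the selected action. With probability epsilon the function picks an action uniformly at random (exploration); otherwise it picks the action with the highest estimated value (exploitation).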
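
One possible way to approach the task is sketched below in Python. This is not the reference solution. The function name epsilon_greedy, the parameter names, the use of the standard-library random module, and breaking ties in favor of the lowest index are all assumptions made for illustration; the problem statement fixes only the three inputs and the returned index.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   seed: Optional[int] = None) -> int:
    """Return the index of the action chosen by an epsilon-greedy rule.

    Assumed conventions (not stated in the problem): ties among greedy
    actions go to the lowest index, and the exploratory draw is uniform
    over all actions, including the greedy one.
    """
    if len(q_values) == 0:
        raise ValueError("q_values must contain at least one action")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")

    # A local generator keeps the result reproducible for a given seed
    # without touching the global random state.
    rng = random.Random(seed)

    # Explore: rng.random() is in [0, 1), so epsilon = 0 never explores
    # and epsilon = 1 always does.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))

    # Exploit: first index holding the maximum estimated value.
    best = max(q_values)
    return next(i for i, q in enumerate(q_values) if q == best)
```

With epsilon set to 0 the call reduces to a plain argmax, and with epsilon set to 1 every call is a uniform random draw, which makes both extremes convenient sanity checks. A grader may expect a specific random number generator (for example NumPy's), so the exact indices returned for a given seed can differ from this sketch.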
