Rainbow DQN Implementation

Advanced & Deep RL DS practice problem on Onlearn.

Difficulty: hard.

Topics: Understanding and Implementing the Rainbow DQN Architecture, Dueling DQN Architecture, Advantage Estimation, Categorical Distributional RL, Noisy Linear Layers, Prioritized Experience Replay, Reinforcement Learning, Deep Learning, Optimization Theory, Probability Theory, Linear Algebra, Value-based Methods, Temporal Difference Learning, Neural Network Architecture, Experience Replay, Policy Gradient Foundations.

Implement a simplified Rainbow DQN agent framework. Specifically, implement the Dueling Network architecture, in which the Q-value is decomposed into a state-value function V(s) and an advantage function A(s, a). The final Q output should follow the formula: Q(s, a) = V(s) + (A(s, a) - (1/|A|) * sum_{a'} A(s, a')), where |A| is the number of actions and the sum runs over all actions a'.
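
The aggregation step can be sketched in plain Python as follows. This is a minimal illustration, not a reference solution: the names `dueling_q_values`, `linear`, and `DuelingHead` are hypothetical, the weights are passed in by hand rather than learned, and the other Rainbow components (noisy layers, distributional outputs, prioritized replay) are left out.

```python
from typing import List, Sequence


def dueling_q_values(value: float, advantages: Sequence[float]) -> List[float]:
    """Combine V(s) and A(s, .) into Q(s, .) by subtracting the mean advantage."""
    if not advantages:
        raise ValueError("advantages must contain at least one action")
    mean_advantage = sum(advantages) / len(advantages)
    # Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a'))
    return [value + (a - mean_advantage) for a in advantages]


def linear(x: Sequence[float],
           weights: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """Dense layer: one output per row of `weights` (hypothetical helper)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class DuelingHead:
    """Two linear streams over shared features: a scalar V and per-action A."""

    def __init__(self, value_w, value_b, adv_w, adv_b):
        self.value_w, self.value_b = value_w, value_b  # shape: 1 x features
        self.adv_w, self.adv_b = adv_w, adv_b          # shape: actions x features

    def forward(self, features: Sequence[float]) -> List[float]:
        value = linear(features, self.value_w, self.value_b)[0]
        advantages = linear(features, self.adv_w, self.adv_b)
        return dueling_q_values(value, advantages)


# Usage with hand-picked weights (assumed values, for illustration only).
head = DuelingHead(
    value_w=[[0.5, 0.25]], value_b=[0.0],
    adv_w=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], adv_b=[0.0, 0.0, 0.0],
)
q_values = head.forward([1.0, 2.0])
```

Subtracting the mean advantage makes the decomposition identifiable: the Q-values for a state average to V(s), so the value stream cannot absorb an arbitrary constant offset from the advantage stream.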