Auto-Sized MLP from Gym Spaces

Planning, Dynamics & Decision Systems DS practice problem on Onlearn.

Difficulty: medium.

Topics: Auto-Sized MLP from Gym Spaces, Multi-Layer Perceptron, Gymnasium Space API, Dynamic Layer Inference, Backpropagation, Weight Initialization, Deep Learning, Control Theory, Software Engineering, Robotics, Mathematical Optimization, Neural Network Architectures, Reinforcement Learning Environments, Dynamic Programming, API Design Patterns, Tensor Manipulation.

In reinforcement learning, neural network policies must interface directly with the observation and action spaces of an environment. A common practical pattern is to determine the input and output dimensions of a Multi-Layer Perceptron (MLP) automatically from the environment's space definitions, rather than hard-coding them.

Implement a function auto_mlp_from_spaces that takes:

- obs_space: a dictionary describing the observation space
- act_space: a dictionary describing the action space
- hidden_layers: a list of integers specifying hidden layer widths

Each space dictionary has a 'type' key and type-specific fields:

- 'Box': has a 'shape' tuple (e.g., (4,) or (3, 4)). The flat dimension is the product of all elements in the shape.
- 'Discrete': has an 'n' integer. The dimension equals n (representing a one-hot encoding for inputs or logit outputs).
- 'MultiBinary': has an 'n' integer. The dimension equals n.

The function should compute the MLP architecture and return a dictionary with:

- 'input_dim': the flattened input dimension from the observation space
- 'output_dim': the output dimension from the action space
- 'layer_shapes': a list of tuples, one per layer, each containing a weight shape tuple and a bias shape tuple: ((in_features, out_features), (out_features,))
- 'total_params': the total count of trainable parameters (all weights and biases)
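
One possible solution is sketched below. It is not a reference answer. It assumes the identifiers in the statement are written with underscores (auto_mlp_from_spaces, obs_space, 'input_dim', and so on), that layers chain through the widths [input_dim, *hidden_layers, output_dim], and that an unrecognized 'type' should raise an error; the helper name space_dim is hypothetical.

```python
from math import prod


def space_dim(space):
    """Flat dimension of a space dictionary (hypothetical helper)."""
    kind = space["type"]
    if kind == "Box":
        # Product of all shape entries, e.g. (3, 4) -> 12.
        return prod(space["shape"])
    if kind in ("Discrete", "MultiBinary"):
        # One-hot / logit width for Discrete, bit count for MultiBinary.
        return space["n"]
    # Assumption: the statement does not say what to do with other types.
    raise ValueError(f"unsupported space type: {kind!r}")


def auto_mlp_from_spaces(obs_space, act_space, hidden_layers):
    input_dim = space_dim(obs_space)
    output_dim = space_dim(act_space)

    # Consecutive pairs of widths define one fully connected layer each.
    widths = [input_dim, *hidden_layers, output_dim]
    pairs = list(zip(widths, widths[1:]))

    layer_shapes = [((n_in, n_out), (n_out,)) for n_in, n_out in pairs]
    # Each layer contributes n_in * n_out weights plus n_out biases.
    total_params = sum(n_in * n_out + n_out for n_in, n_out in pairs)

    return {
        "input_dim": input_dim,
        "output_dim": output_dim,
        "layer_shapes": layer_shapes,
        "total_params": total_params,
    }


result = auto_mlp_from_spaces(
    {"type": "Box", "shape": (4,)},
    {"type": "Discrete", "n": 2},
    [64, 64],
)
print(result["total_params"])  # 320 + 4160 + 130 = 4610
```

With an empty hidden_layers list, the sketch yields a single linear layer mapping input_dim directly to output_dim.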