Flatten and Unflatten Hierarchical Gym Spaces

Planning, Dynamics & Decision Systems DS practice problem on Onlearn.

Difficulty: medium.

Topics: Flatten and Unflatten Hierarchical Gym Spaces, Box Space Flattening, Dict Space Unflattening, Tuple Space Iteration, NumPy Array Contiguity, Recursive Tree Mapping, Reinforcement Learning, Software Engineering, Control Theory, Data Structures, Robotics Middleware, Gymnasium Environment Design, Serialization Protocols, State Space Representation, Recursive Data Traversal, Memory Layout Optimization.

In reinforcement learning, environments often expose complex, hierarchical observation and action spaces. For example, a robotics environment might provide observations as a nested dictionary containing camera images, joint positions, and goal coordinates at various levels of nesting. However, many RL algorithms (policy networks, replay buffers, etc.) require flat 1D vectors as input.

Implement two functions that convert between hierarchical data structures and flat numpy arrays:

- flatten_space(space, data): Takes a space descriptor and the corresponding nested data, and returns a single 1D numpy float array with all leaf values concatenated.
- unflatten_space(space, flat_array): Takes a space descriptor and a 1D numpy array, and reconstructs the original nested data structure.

The space descriptor format is:

- An integer n represents a leaf space of dimension n (the corresponding data is a numpy array of length n).
- A dict {key1: subspace1, key2: subspace2, ...} represents a named collection of sub-spaces (like Gym's Dict space).
- A list [subspace1, subspace2, ...] represents an ordered collection of sub-spaces (like Gym's Tuple space).

For deterministic ordering, dictionary keys must always be processed in sorted (lexicographic) order. List elements are processed in their natural index order.

The functions must handle arbitrary nesting depth. The flatten_space function should return a 1D numpy array of dtype float. The unflatten_space function should return dicts, lists, and numpy arrays matching the space structure.
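
The rules above can be sketched in Python as follows. This is one possible approach under stated assumptions, not a reference solution: the function names are assumed to be written with underscores (flatten_space, unflatten_space), the inner helper build is a hypothetical name introduced here, and returning an empty array for an empty dict or list is a choice the problem statement does not specify.

```python
import numpy as np


def flatten_space(space, data):
    """Concatenate all leaf arrays of `data` into one 1D float array."""
    if isinstance(space, int):
        # Leaf: coerce to a flat float array of length `space`.
        return np.asarray(data, dtype=float).reshape(-1)
    if isinstance(space, dict):
        # Sorted key order keeps the layout deterministic.
        parts = [flatten_space(space[k], data[k]) for k in sorted(space)]
    elif isinstance(space, list):
        # Lists are walked in natural index order.
        parts = [flatten_space(s, d) for s, d in zip(space, data)]
    else:
        raise TypeError(f"unsupported space descriptor: {space!r}")
    if not parts:
        # Assumption: an empty container flattens to an empty array.
        return np.zeros(0, dtype=float)
    return np.concatenate(parts)


def unflatten_space(space, flat_array):
    """Rebuild the nested structure described by `space` from a 1D array."""
    flat = np.asarray(flat_array)

    def build(sub, offset):
        # Hypothetical helper: returns (rebuilt value, next unread offset).
        if isinstance(sub, int):
            # Copy so the result does not alias the input buffer.
            return flat[offset:offset + sub].copy(), offset + sub
        if isinstance(sub, dict):
            out = {}
            for k in sorted(sub):
                out[k], offset = build(sub[k], offset)
            return out, offset
        if isinstance(sub, list):
            out = []
            for s in sub:
                item, offset = build(s, offset)
                out.append(item)
            return out, offset
        raise TypeError(f"unsupported space descriptor: {sub!r}")

    result, _ = build(space, 0)
    return result


# Usage: a dict containing a leaf and a list with a nested dict.
space = {"pos": 2, "sensors": [1, {"b": 1, "a": 2}]}
data = {
    "pos": np.array([1.0, 2.0]),
    "sensors": [np.array([3.0]), {"b": np.array([6.0]), "a": np.array([4.0, 5.0])}],
}
flat = flatten_space(space, data)          # values 1, 2, 3, 4, 5, 6 in that order
restored = unflatten_space(space, flat)    # same nesting as `data`
```

In the usage example, "a" is emitted before "b" inside the nested dict because keys are sorted, even though "b" was written first. Threading an offset through the recursion avoids computing each sub-space size separately, since every call reports where the next sibling should start reading.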