Hashing for Memory-Efficient Tile Coding
Representation Learning, Advanced Theory & Miscellaneous DS practice problem on Onlearn.
Difficulty: medium.
Topics: Hashing for Memory-Efficient Tile Coding, Tile Coding, Collision Probability, Feature Vector Sparsity, Hash Table Indexing, Generalization Error, Reinforcement Learning, Function Approximation, Data Structures, Computational Complexity, Numerical Analysis, Sparse Feature Representation, Linear Function Approximation, Locality Sensitive Hashing, Memory Management, State Space Discretization.
Implement a memory-efficient tile coding feature representation using hashing, a technique commonly used in reinforcement learning with linear function approximation. Tile coding creates multiple overlapping grids (tilings) over a continuous state space. Each tiling is offset from the others so that nearby states share active tiles across tilings, enabling smooth generalization. However, the total number of possible tiles grows exponentially with the number of dimensions, so with many tilings and fine resolution it quickly becomes too large to store. Hashing compresses these tile indices into a fixed-size memory array, trading a small chance of collisions for massive memory savings.

Your function should:
1. Accept a continuous state vector, a tiling configuration, state space bounds, a hash table size (memory_size), and an optional weight vector.
2. For each tiling, compute a systematic offset to shift the grid, then determine which tile the state falls into along each dimension.
3. Combine the tiling index and the per-dimension tile coordinates into a composite key, then hash this key into the range [0, memory_size) using a deterministic hash function.
4. Return the sorted list of active (hashed) tile indices and a value estimate.

Offset scheme: For tiling t and dimension d, the offset is t * tile_width_d / num_tilings, where tile_width_d = (state_highs[d] - state_lows[d]) / num_tiles_per_dim[d].

Tile coordinate: For each dimension d, shift the state by subtracting the lower bound and adding the offset, then divide by the tile width and take the floor: floor((state[d] - state_lows[d] + offset) / tile_width_d).

Hash function: Use the following deterministic prime-based hash. Given a coordinate tuple (tiling_index, coord_0, coord_1, ...), compute h = sum(coords[i] * primes[i % len(primes)]), where primes = [509, 521, 523, 541, 547, 557, 563, 569, 571, 577], then return abs(h) % memory_size.

Value estimate: If a weight vector is provided, sum the weights at each active tile index (counting an index once per occurrence, so duplicates from hash collisions are included). Otherwise return 0.0.
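The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a reference solution: the function names hashed_tile_coding and hash_coords, the argument order, and the underscore spellings of the parameters are assumptions made for illustration, since the problem statement does not fix a signature.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Primes listed in the problem statement.
PRIMES = [509, 521, 523, 541, 547, 557, 563, 569, 571, 577]


def hash_coords(coords: Sequence[int], memory_size: int) -> int:
    """Prime-based hash: weighted sum of coordinates, folded into [0, memory_size)."""
    h = sum(c * PRIMES[i % len(PRIMES)] for i, c in enumerate(coords))
    return abs(h) % memory_size


def hashed_tile_coding(
    state: Sequence[float],
    num_tilings: int,
    num_tiles_per_dim: Sequence[int],
    state_lows: Sequence[float],
    state_highs: Sequence[float],
    memory_size: int,
    weights: Optional[Sequence[float]] = None,
) -> Tuple[List[int], float]:
    """Hypothetical signature; returns (sorted active indices, value estimate)."""
    active = []
    for t in range(num_tilings):
        coords = [t]  # the tiling index is the first element of the key
        for d, x in enumerate(state):
            tile_width = (state_highs[d] - state_lows[d]) / num_tiles_per_dim[d]
            offset = t * tile_width / num_tilings
            coords.append(math.floor((x - state_lows[d] + offset) / tile_width))
        active.append(hash_coords(coords, memory_size))
    active.sort()  # duplicates are kept on purpose
    if weights is None:
        return active, 0.0
    return active, float(sum(weights[i] for i in active))


# Usage: 2 tilings over the unit square, 4 tiles per dimension.
tiles, value = hashed_tile_coding([0.5, 0.5], 2, [4, 4], [0.0, 0.0], [1.0, 1.0], 1024)
print(tiles, value)  # [40, 549] 0.0
```

In this usage example, tiling 0 yields the key (0, 2, 2) with h = 2088, and tiling 1 yields (1, 2, 2) with h = 2597, which reduce to 40 and 549 modulo 1024. The sketch does not clamp states that fall outside the given bounds, because the statement does not say how those should be handled.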
Note: The returned list of active tiles may contain duplicate indices when hash collisions occur between different tilings.
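To illustrate how such duplicates arise, the short sketch below applies the prime-based hash to two coordinate tuples from different tilings with a deliberately tiny table. The helper name hash_coords and the chosen tuples are illustrative assumptions. The tuples correspond to a state such as (0.5, 0.7) on the unit square with 2 tilings and 4 tiles per dimension.

```python
# Primes listed in the problem statement.
PRIMES = [509, 521, 523, 541, 547, 557, 563, 569, 571, 577]


def hash_coords(coords, memory_size):
    """Hypothetical helper implementing the prime-based hash from the statement."""
    h = sum(c * PRIMES[i % len(PRIMES)] for i, c in enumerate(coords))
    return abs(h) % memory_size


memory_size = 8  # deliberately tiny so that a collision is easy to see

key_tiling_0 = (0, 2, 2)  # h = 2*521 + 2*523 = 2088
key_tiling_1 = (1, 2, 3)  # h = 509 + 2*521 + 3*523 = 3120

active = sorted(hash_coords(k, memory_size) for k in (key_tiling_0, key_tiling_1))
print(active)  # [0, 0] because 2088 and 3120 are both multiples of 8

# The colliding index is counted once per tiling in the value estimate.
weights = [0.0] * memory_size
weights[0] = 0.5
value = sum(weights[i] for i in active)
print(value)  # 1.0
```

Because both tilings land on index 0, the weight stored there contributes twice to the value estimate, which is the behavior the value-estimate rule asks for.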