Tiling Offset Strategy Comparison

Representation Learning, Advanced Theory & Miscellaneous DS practice problem on Onlearn.

Difficulty: medium.

Topics: Tiling Offset Strategy Comparison, Tiling Offset, Padding Strategy, Stride Alignment, Cache Locality, Overlap-Add Method, Computer Vision, Numerical Analysis, Computational Geometry, Parallel Computing, Signal Processing, Image Patching, Boundary Condition Handling, Spatial Data Structures, Memory Access Patterns, Convolutional Operations.

Implement a function that computes active tile indices and pairwise feature similarity between states under two different tiling offset strategies used in tile coding for reinforcement learning with continuous state spaces. Tile coding overlays multiple offset grids (tilings) over a continuous state space. Each tiling partitions the space into tiles, and a state activates exactly one tile per tiling. The offset strategy determines how tilings are displaced relative to each other, which significantly affects generalization properties. Given: states: a list of states, where each state is a list of floats (one per dimension) num tilings: the number of tilings to overlay tiles per dim: a list of integers specifying how many tiles each dimension is divided into per tiling state low: a list of floats for the lower bound of each dimension state high: a list of floats for the upper bound of each dimension strategy: a string, either "uniform" or "asymmetric" The tile width for dimension d is (state high[d] state low[d]) / tiles per dim[d]. For the uniform strategy, each tiling t (0 indexed) is shifted by t tile width d / num tilings in every dimension d. For the asymmetric strategy, each tiling t is shifted by t (2 d + 1) tile width d / num tilings in dimension d, where d is the 0 indexed dimension. This uses different displacement units per dimension to decorrelate tile boundaries. For each tiling and dimension, the tile index is computed by flooring the adjusted position, then clipping to [0, tiles per dim[d] 1]. Tile indices within a tiling are flattened using row major order. Global tile indices add t tiles per tiling where tiles per tiling is the product of all tiles per dim values. Return a tuple of: 1. A list of lists of active global tile indices (one list per state, each of length num tilings) 2. A similarity matrix (list of lists) where entry [i][j] is the fraction of shared active tiles between states i and j, rounded to 4 decimal places