Kanerva Coding for High-Dimensional Spaces
Representation Learning, Advanced Theory & Miscellaneous DS practice problem on Onlearn.
Difficulty: hard.
Topics: Kanerva Coding for High-Dimensional Spaces, Kanerva Address Space, Hard Location Activation, Johnson-Lindenstrauss Lemma, Sparse Vector Quantization, Superpositional Encoding, Information Theory, High-Dimensional Geometry, Sparse Distributed Memory, Neural Network Architectures, Stochastic Processes, Hamming Distance Metrics, Hyperdimensional Computing, Associative Memory Models, Random Projection Methods, Sparse Coding Algorithms.
Implement a function kanerva coding td that performs value function approximation using Kanerva coding combined with semi gradient TD(0) learning. Kanerva coding is a sparse binary feature representation for continuous state spaces. A set of prototype points are placed in the state space. For any given state, a binary feature vector is constructed by checking which prototypes lie within a specified distance threshold of that state. These binary features are then used with a linear function approximator for value prediction. Parameters: prototypes: A numpy array of shape (k, d) representing k prototype points in d dimensional space. threshold: A float representing the Euclidean distance threshold. A prototype is activated (feature = 1) if its Euclidean distance to the current state is less than or equal to the threshold. episodes: A list of episodes. Each episode is a list of (state, reward, next state, done) tuples, where state and next state are lists of length d. gamma: Float, the discount factor. alpha: Float, the learning rate for semi gradient TD(0). query states: A numpy array of shape (m, d) representing states to evaluate after training. Requirements: Initialize the weight vector (length k) to all zeros. For each transition, construct the binary feature vector for the current state, compute the estimated value, compute the TD error (using bootstrapped next state value unless terminal), and update the weights using the semi gradient TD(0) rule. When a state activates no prototypes, its estimated value is 0. Process all episodes in order, updating weights online. Returns: A tuple (weights, values) where weights is a list of floats of length k (rounded to 4 decimal places) and values is a list of floats of length m (rounded to 4 decimal places) representing estimated values for the query states.