t-SNE Gradient Calculation
Calculus & Optimization DS practice problem on Onlearn.
Difficulty: medium.
Topics: T-SNE Gradient Calculation, Kullback-Leibler Divergence, Student's t-distribution, Cauchy Distribution, Pairwise Affinity Matrix, Perplexity Parameter, Calculus, Optimization Theory, Probability Theory, Linear Algebra, Manifold Learning, Multivariate Differentiation, Gradient-Based Optimization, Information Geometry, Dimensionality Reduction, Kernel Methods.
Implement the gradient calculation for t-SNE (t-distributed Stochastic Neighbor Embedding), a popular dimensionality reduction algorithm used for visualizing high-dimensional data. t-SNE works by minimizing the KL divergence between the joint probability distribution P in the high-dimensional space and the joint probability distribution Q in the low-dimensional embedding space. The Q distribution uses Student's t-distribution with one degree of freedom.
Write a function that computes the gradient of the t-SNE cost function with respect to the low-dimensional embedding Y. Your function should:
1. Take the joint probability matrix P from the high-dimensional space (symmetric, non-negative, zero diagonal).
2. Take the current low-dimensional embedding Y as an (n, d) array, where n is the number of points and d is the embedding dimension.
3. Compute the Q distribution from the current embedding using the Student's t-distribution.
4. Calculate and return the gradient for each point in the embedding.
The function should return the gradient as a NumPy array with the same shape as Y, rounded to 4 decimal places.
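One possible approach, sketched under a few assumptions: with the unnormalized Student-t kernel w_ij = 1 / (1 + ||y_i - y_j||^2) and w_ii = 0, the low-dimensional affinities are q_ij = w_ij / sum_{k != l} w_kl, and the standard t-SNE gradient of the KL cost is dC/dy_i = 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j). The function name `tsne_gradient` is hypothetical (the problem statement does not specify one), and the sketch assumes n >= 2 and that P already sums to 1.

```python
import numpy as np


def tsne_gradient(P, Y):
    """Gradient of the t-SNE KL cost with respect to the embedding Y.

    Hypothetical helper name; assumes P is symmetric with a zero diagonal
    and sums to 1, and that Y holds at least two points.
    """
    P = np.asarray(P, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Pairwise differences y_i - y_j, shape (n, n, d), via broadcasting.
    diff = Y[:, None, :] - Y[None, :, :]

    # Squared Euclidean distances between embedded points, shape (n, n).
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Student-t kernel with one degree of freedom; no self-affinity.
    W = 1.0 / (1.0 + sq_dist)
    np.fill_diagonal(W, 0.0)

    # Normalize over all pairs to obtain the joint distribution Q.
    Q = W / np.sum(W)

    # dC/dy_i = 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j)
    coeff = (P - Q) * W
    grad = 4.0 * np.sum(coeff[:, :, None] * diff, axis=1)

    return np.round(grad, 4)


if __name__ == "__main__":
    # Small illustrative input (made-up values, not from the problem page).
    P_demo = np.array([[0.0, 0.3, 0.2],
                       [0.3, 0.0, 0.0],
                       [0.2, 0.0, 0.0]])
    Y_demo = np.array([[0.0, 0.0],
                       [1.0, 0.5],
                       [-0.5, 2.0]])
    print(tsne_gradient(P_demo, Y_demo))
```

Two properties make convenient sanity checks for a sketch like this: the gradient vanishes when P equals Q, and because (p_ij - q_ij) * w_ij is symmetric while y_i - y_j is antisymmetric, the per-point gradients sum to zero (up to rounding).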