Derivative of Cross-Entropy Loss w.r.t. Logits
Calculus & Optimization DS practice problem on Onlearn.
Difficulty: hard.
Topics: Understanding the Analytical Gradient of Cross-Entropy Loss with Respect to Logits, One-hot Encoding, Logit Normalization, Numerical Stability of Log-Sum-Exp, Partial Derivatives, Categorical Cross-Entropy Loss, Multivariable Calculus, Probability Theory, Information Theory, Optimization Theory, Linear Algebra, Softmax Function, Maximum Likelihood Estimation, Gradient Descent, Jacobian Matrices, Chain Rule.
Given a vector of logits z and a one-hot encoded ground-truth vector y, compute the analytical gradient of the categorical cross-entropy loss with respect to the logits. Implement a function that returns the gradient vector p - y, where p is the softmax-transformed output of z.
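One possible sketch of such a function, assuming NumPy and a single 1-D logit vector rather than a batch. The function name cross_entropy_grad is an illustrative choice, not something the problem statement prescribes. The result follows from the chain rule: with L = -sum_i y_i log p_i and p = softmax(z), the partial derivative dL/dz_j reduces to p_j - y_j whenever the entries of y sum to 1, which holds for a one-hot vector.

```python
import numpy as np


def cross_entropy_grad(z, y):
    """Return dL/dz = softmax(z) - y for categorical cross-entropy.

    z: 1-D array-like of logits.
    y: 1-D array-like one-hot target with the same length as z.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)

    # Softmax is invariant to adding a constant to every logit, so
    # subtracting the maximum changes nothing mathematically but keeps
    # np.exp from overflowing on large logits.
    shifted = z - np.max(z)
    exp_z = np.exp(shifted)
    p = exp_z / np.sum(exp_z)

    return p - y


if __name__ == "__main__":
    grad = cross_entropy_grad([1.0, 2.0, 3.0], [0.0, 0.0, 1.0])
    print(grad)
```

Because p sums to 1 and a one-hot y also sums to 1, the components of the returned gradient sum to zero; the entry for the true class is negative and every other entry is positive, which makes a quick sanity check on any implementation.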