Gradient Clipping by Global Norm

Backpropagation, Training & Optimization DS practice problem on Onlearn.

Difficulty: medium.

Topics: Gradient Clipping by Global Norm, Exploding Gradients, L2 Norm, Global Scaling Factor, Chain Rule, Weight Update Clipping, Optimization Theory, Numerical Analysis, Deep Learning Foundations, Calculus, Software Engineering, Gradient-Based Optimization, Training Stability, Vector Norms, Backpropagation, Hyperparameter Tuning.

Implement gradient clipping by global norm. Given a list of gradient arrays (one per parameter) and a maximum norm threshold, compute the global L2 norm: the square root of the sum of squared entries across all arrays taken together. If the global norm exceeds the threshold, multiply every gradient by the same factor, threshold / global norm, so that the global norm of the result equals the threshold; otherwise leave the gradients unchanged. Return the clipped gradients in the same order and with the same shapes as the input.
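
One possible solution is sketched below in Python with NumPy, which the problem statement does not prescribe. The function names global_l2_norm and clip_by_global_norm are hypothetical, and the sketch assumes the inputs are arrays or nested lists of real numbers and that max_norm is positive. It always returns new arrays rather than modifying the inputs in place, which is a design choice and not a requirement of the problem.

```python
import numpy as np


def global_l2_norm(grads):
    """Square root of the sum of squared entries over every array in grads."""
    return float(np.sqrt(sum(np.sum(np.square(g)) for g in grads)))


def clip_by_global_norm(grads, max_norm):
    """Scale all gradients by one shared factor if their global norm exceeds max_norm.

    Returns a new list of arrays with the same order and shapes as the input.
    """
    # Accept nested lists as well as arrays; work in floating point.
    grads = [np.asarray(g, dtype=float) for g in grads]
    total = global_l2_norm(grads)

    # At or below the threshold (including an all-zero gradient): no scaling.
    if total <= max_norm:
        return [g.copy() for g in grads]

    # One shared factor keeps the direction of the overall gradient unchanged.
    scale = max_norm / total
    return [g * scale for g in grads]


# Global norm here is sqrt(3^2 + 4^2 + 0^2 + 12^2) = 13.
example_grads = [np.array([3.0, 4.0]), np.array([[0.0, 12.0]])]
clipped = clip_by_global_norm(example_grads, max_norm=6.5)
```

In this example the threshold is half the global norm, so every entry is halved. Because a single factor is applied to all arrays, the relative sizes of the per-parameter gradients are preserved, which is what distinguishes global-norm clipping from clipping each array independently.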