GRU Cell
Sequence Models & Generative Models DS practice problem on Onlearn.
Difficulty: medium.
Topics: GRU Cell, Reset Gate, Update Gate, Hidden State Candidate, Vanishing Gradient Mitigation, Sigmoid Activation Function, Deep Learning Foundations, Sequence Modeling, Optimization Theory, Information Theory, Computational Linear Algebra, Recurrent Neural Networks, Gating Mechanisms, Gradient Flow Dynamics, Weight Initialization Strategies, Backpropagation Through Time.
Problem

Implement the forward pass of a Gated Recurrent Unit (GRU) cell. The GRU is a recurrent neural network architecture that uses gating mechanisms to control the flow of information, which helps mitigate the vanishing gradient problem. Given an input vector and the previous hidden state, a GRU cell computes a new hidden state using an update gate and a reset gate.

Input Parameters:
x: Input vector of shape (input_size,)
h_prev: Previous hidden state of shape (hidden_size,)
W_z, W_r, W_h: Weight matrices for the input, each of shape (hidden_size, input_size)
U_z, U_r, U_h: Weight matrices for the hidden state, each of shape (hidden_size, hidden_size)
b_z, b_r, b_h: Bias vectors, each of shape (hidden_size,)

Output:
h_next: New hidden state of shape (hidden_size,)

The two gates use the sigmoid activation, and the candidate hidden state uses tanh. The update gate controls how much of the previous hidden state to retain, while the reset gate controls how much of the previous hidden state to forget when computing the candidate hidden state.
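One possible forward pass is sketched below with NumPy, as an illustration rather than the reference solution. The problem statement does not spell out the equations, so two conventions are assumed here: the reset gate is applied to the previous hidden state before it is multiplied by U_h, and the update gate z weights the previous hidden state (h_next = z * h_prev + (1 - z) * h_candidate), which matches the description that the update gate controls how much of the previous state is retained. Some references swap the roles of z and 1 - z, so check which form the grader expects. The function name gru_cell_forward and the helper sigmoid are hypothetical names chosen for this sketch.

```python
import numpy as np


def sigmoid(a):
    # Logistic function, applied elementwise.
    return 1.0 / (1.0 + np.exp(-a))


def gru_cell_forward(x, h_prev, W_z, W_r, W_h, U_z, U_r, U_h, b_z, b_r, b_h):
    """Single GRU step (assumed convention: z keeps the previous state)."""
    # Update gate: how much of h_prev to carry over, shape (hidden_size,).
    z = sigmoid(W_z @ x + U_z @ h_prev + b_z)
    # Reset gate: how much of h_prev feeds the candidate, shape (hidden_size,).
    r = sigmoid(W_r @ x + U_r @ h_prev + b_r)
    # Candidate hidden state; the reset gate scales h_prev elementwise
    # before the recurrent weights are applied (assumed variant).
    h_candidate = np.tanh(W_h @ x + U_h @ (r * h_prev) + b_h)
    # Elementwise interpolation between the old state and the candidate.
    h_next = z * h_prev + (1.0 - z) * h_candidate
    return h_next


# Usage with small random parameters (sizes chosen only for illustration).
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
x = rng.standard_normal(input_size)
h_prev = rng.standard_normal(hidden_size)
W_z, W_r, W_h = (rng.standard_normal((hidden_size, input_size)) for _ in range(3))
U_z, U_r, U_h = (rng.standard_normal((hidden_size, hidden_size)) for _ in range(3))
b_z, b_r, b_h = (np.zeros(hidden_size) for _ in range(3))

h_next = gru_cell_forward(x, h_prev, W_z, W_r, W_h, U_z, U_r, U_h, b_z, b_r, b_h)
print(h_next.shape)
```

A quick sanity check under these assumptions: with all weights and biases set to zero, both gates equal 0.5 and the candidate is 0, so the output is exactly half of h_prev. With a large positive b_z the update gate saturates near 1 and the cell copies h_prev almost unchanged, which is the path that lets gradients flow across many time steps.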