Position-wise Feed-Forward Block with Residual and Dropout
Sequence Models & Generative Models DS practice problem on Onlearn.
Difficulty: easy.
Topics: Position-wise Feed-Forward Block with Residual and Dropout, Position-wise Feed-Forward Network, Residual Connection, Dropout Probability, Layer Normalization, Activation Functions, Deep Learning Architectures, Numerical Linear Algebra, Optimization Theory, Regularization Techniques, Sequence Modeling, Transformer Components, Matrix Transformations, Stochastic Gradient Methods, Generalization Strategies, Attention Mechanisms.
Implement a position-wise feed-forward network (FFN) block with a residual connection and dropout, as used in Transformer architectures. The block takes an input vector, applies two linear transformations with a ReLU activation between them, applies dropout to the result, and then adds the original input back as a residual connection. Round outputs to 4 decimal places for reproducibility.
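One possible reading of the task can be sketched in NumPy as follows. This is a minimal sketch, not a reference solution: the function name `ffn_block`, its parameter names, the weight shapes (`W1` as `(d_model, d_ff)`, `W2` as `(d_ff, d_model)`), the `seed` and `training` arguments, and the use of inverted dropout (rescaling kept units by `1 / (1 - p)`) are all assumptions that the problem statement does not specify.

```python
import numpy as np

def ffn_block(x, W1, b1, W2, b2, dropout_p=0.0, seed=None, training=True):
    """Hypothetical FFN block: Linear -> ReLU -> Linear -> Dropout -> + x."""
    if not 0.0 <= dropout_p < 1.0:
        raise ValueError("dropout_p must be in [0, 1)")
    x = np.asarray(x, dtype=float)

    # First linear transformation followed by ReLU.
    hidden = np.maximum(0.0, x @ np.asarray(W1, dtype=float) + b1)
    # Second linear transformation projects back to the input dimension.
    out = hidden @ np.asarray(W2, dtype=float) + b2

    # Assumption: inverted dropout, applied only in training mode.
    if training and dropout_p > 0.0:
        rng = np.random.default_rng(seed)
        keep = rng.random(out.shape) >= dropout_p
        out = out * keep / (1.0 - dropout_p)

    # Residual connection from the original input, rounded to 4 decimals.
    return np.round(x + out, 4)

# Usage with dropout disabled, so the result is deterministic.
x = [1.0, -2.0]
W1 = np.eye(2)
b1 = np.zeros(2)
W2 = 0.5 * np.eye(2)
b2 = np.array([0.1, 0.1])
result = ffn_block(x, W1, b1, W2, b2, dropout_p=0.0)
# With dropout_p=0 the values are [1.6, -1.9].
```

Passing a fixed `seed` makes the dropout mask repeatable, which is what allows rounded outputs to be compared across runs. The exact mask a grader expects depends on its random number generator, so the seeded values here should not be assumed to match any particular platform's expected output.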