GeLU Activation Function

Neural Units & Activations DS practice problem on Onlearn.

Difficulty: medium.

Topics: Understanding and Implementing the Gaussian Error Linear Unit (GeLU), Error Function (erf), Cumulative Distribution Function (CDF), Sigmoid Gating Mechanism, Vanishing Gradient Mitigation, Taylor Series Expansion, Deep Learning, Neural Networks, Calculus, Optimization, Probabilistic Modeling, Non-linear Activation Functions, Backpropagation Gradients, Normal Distribution Properties, Function Approximation, Transformer Architectures.

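Implement the GeLU activation function in two forms: the exact definition, GeLU(x) = 0.5 * x * (1 + erf(x / sqrt(2))), computed with the error function (math.erf), and the widely used tanh approximation, GeLU(x) ≈ 0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3))). Each function should accept a float or a NumPy array and return the transformed values in the same form. Note that math.erf accepts only scalars, so array inputs must be handled element-wise.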
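
One possible solution is sketched below. The function names gelu_exact and gelu_tanh and the helper _erf are illustrative choices, not names required by the problem. Because math.erf only accepts scalars, this sketch assumes it is acceptable to wrap it with np.vectorize for array inputs; scipy.special.erf would be a faster alternative if SciPy is available.

```python
import math

import numpy as np

# math.erf is scalar-only; np.vectorize applies it element-wise.
# This is a convenience loop, not a true vectorized kernel.
_erf = np.vectorize(math.erf, otypes=[float])


def gelu_exact(x):
    """Exact GeLU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    x_arr = np.asarray(x, dtype=float)
    out = 0.5 * x_arr * (1.0 + _erf(x_arr / math.sqrt(2.0)))
    # Return a plain float for scalar input, an ndarray otherwise.
    return float(out) if out.ndim == 0 else out


def gelu_tanh(x):
    """Tanh approximation of GeLU."""
    x_arr = np.asarray(x, dtype=float)
    inner = math.sqrt(2.0 / math.pi) * (x_arr + 0.044715 * x_arr**3)
    out = 0.5 * x_arr * (1.0 + np.tanh(inner))
    return float(out) if out.ndim == 0 else out


if __name__ == "__main__":
    xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
    print("exact:", gelu_exact(xs))
    print("tanh: ", gelu_tanh(xs))
    print("scalar:", gelu_exact(1.0), gelu_tanh(1.0))
```

Both functions convert the input with np.asarray, so lists are accepted as well, and scalar inputs come back as Python floats. The two forms should agree closely on typical inputs, with the tanh version trading a small amount of accuracy for avoiding the error function.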