Module 06: Deep Learning Foundations
68 items: 25 Easy, 32 Medium, 11 Hard
Submodule 01: Neural Units & Activations (20 items)
- 1. Single Neuron [Easy]
- 2. Single Neuron with Backpropagation [Medium]
- 3. Sigmoid Activation Function [Easy]
- 4. Tanh Activation Function [Medium]
- 5. ReLU Activation Function [Easy]
- 6. Softmax Activation Function [Easy]
- 7. Implementation of Log Softmax Function [Easy]
- 8. Leaky ReLU Activation Function [Easy]
- 9. ELU Activation Function [Easy]
- 10. The SELU Activation Function [Easy]
- 11. The Softplus Activation Function [Easy]
- 12. PReLU Forward and Backward Pass [Medium]
- 13. GeLU Activation Function [Medium]
- 14. The Mish Activation Function [Medium]
- 15. The Hardtanh Activation Function [Easy]
- 16. Dynamic Tanh: Normalization-Free Transformer Activation [Easy]
- 17. The Square ReLU Activation Function [Easy]
- 18. The Hard Sigmoid Activation Function [Easy]
- 19. The Softsign Activation Function [Easy]
- 20. SwiGLU Activation Function [Easy]
Submodule 02: Backpropagation, Training & Optimization (18 items)
- 1. Implementing Basic Autograd Operations [Medium]
- 2. Adagrad Optimizer [Easy]
- 3. Adadelta Optimizer [Medium]
- 4. Adam Optimizer [Medium]
- 5. Adamax Optimizer [Easy]
- 6. Momentum Optimizer [Easy]
- 7. Mixed Precision Training [Medium]
- 8. A Simple CNN Training Function with Backpropagation [Hard]
- 9. RMSProp Optimizer [Medium]
- 10. AdamW Optimizer Step [Hard]
- 11. Weight Decay as L2 Regularization [Medium]
- 12. Temperature Decay Scheduler [Medium]
- 13. Number Format Precision Comparison (FP16 vs BF16 vs FP8 vs FP4) [Hard]
- 14. Neural Memory Update with Surprise and Momentum [Hard]
- 15. Muon Optimizer Step with Matrix Preconditioning [Hard]
- 16. MuonClip (qk-clip) for Stabilizing Attention [Hard]
- 17. Gradient Clipping by Global Norm [concept, Medium]
- 18. Gradient Clipping by Value [concept, Easy]
Submodule 03: Initialization, Normalization & Regularization (17 items)
- 1. Batch Normalization for BCHW Input [Easy]
- 2. Group Normalization [Medium]
- 3. Layer Normalization for Sequence Data [Medium]
- 4. Instance Normalization (IN) Implementation [Medium]
- 5. ExponentialLR Learning Rate Scheduler [Easy]
- 6. He Weight Initialization [Medium]
- 7. He Weight Initialization for Neural Networks [Medium]
- 8. Xavier/Glorot Weight Initialization [Medium]
- 9. RMSNorm (Root Mean Square Layer Normalization) [Medium]
- 10. Local Response Normalization (LRN) [Medium]
- 11. Spectral Normalization [Hard]
- 12. Pre-Norm vs Post-Norm Transformer Block [Hard]
- 13. QK-Norm (Query-Key Normalization) [Hard]
- 14. Dropout Layer [Medium]
- 15. Regularization via Information Bottleneck [Hard]
- 16. StepLR Learning Rate Scheduler [concept, Easy]
- 17. Cosine Annealing with Warm Restarts [concept, Medium]
Submodule 04: Sequence Models & Generative Models (13 items)
- 1. Implementing a Simple RNN [concept, Medium]
- 2. GRU Cell [concept, Medium]
- 3. Long Short-Term Memory (LSTM) Network [concept, Medium]
- 4. TDNN for Variable-Length Sequences [concept, Medium]
- 5. Position-wise Feed-Forward Block with Residual and Dropout [concept, Easy]
- 6. A Simple Residual Block with Shortcut Connection [concept, Easy]
- 7. Number of Parameters in Neural Network [concept, Easy]
- 8. Train a Simple GAN on 1D Gaussian Data [concept, Medium]
- 9. Variational Autoencoder (VAE) Loss (ELBO) [concept, Medium]
- 10. Variational Inference: ELBO Computation [concept, Hard]
- 11. Triplet Margin Loss [concept, Medium]
- 12. Contrastive Loss (InfoNCE / SimCLR-style) [concept, Medium]
- 13. Knowledge Distillation Loss [concept, Medium]