Author: Moh Rafik
Goal: Master the theory and implementation of Deep Learning — from Neural Networks to Transformers.
This repository is part of the AI Learning Roadmap (3-Month Intensive) that includes:
- Machine Learning Basics
- Deep Learning Foundations ← (You are here)
- Generative AI Projects
This repository focuses on:
- Understanding the fundamentals of Neural Networks
- Implementing deep learning architectures from scratch and using PyTorch/TensorFlow
- Exploring CNNs, RNNs, LSTMs, and Transformers
- Building practical deep learning projects for vision and NLP
```
deep-learning-foundations/
│
├── 01_neural_networks/
│   ├── perceptron_from_scratch.ipynb
│   ├── backpropagation_numpy.ipynb
│   └── activation_functions_visualization.ipynb
│
├── 02_cnn_architectures/
│   ├── cnn_mnist_pytorch.ipynb
│   ├── resnet_cifar10.ipynb
│   └── vgg_transfer_learning.ipynb
│
├── 03_rnn_lstm/
│   ├── rnn_text_generation.ipynb
│   └── lstm_sentiment_analysis.ipynb
│
├── 04_transformers_intro/
│   ├── attention_mechanism.ipynb
│   └── mini_transformer_from_scratch.ipynb
│
├── projects/
│   ├── image_classification_cnn.ipynb
│   └── nlp_sentiment_project.ipynb
│
├── assets/
│   ├── images/
│   └── figures/
│
└── README.md
```
| Week | Focus | Topics |
|---|---|---|
| 5 | Neural Network Fundamentals | Perceptron, MLPs, Activation Functions, Backpropagation |
| 6 | Convolutional Neural Networks | CNN basics, Filters, Pooling, Architectures (LeNet, VGG, ResNet) |
| 7 | Recurrent Neural Networks | RNNs, LSTMs, GRUs, Sequence modeling |
| 8 | Transformers | Attention Mechanisms, Encoder-Decoder, Self-Attention |
- Concept: A neural network is composed of layers of interconnected nodes (neurons) that progressively transform input features into predictions.
- Mathematics:
  - Linear combination: $z = w^T x + b$
- Activation functions introduce non-linearity (ReLU, Sigmoid, Tanh, Softmax).
- Backpropagation updates weights via gradient descent.
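The three ideas above can be combined in a short NumPy sketch: a one-hidden-layer network with a ReLU activation, trained by hand-derived backpropagation and gradient descent on a toy regression target. All sizes, names, the learning rate, and the target are illustrative, not a prescribed setup:

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.normal(size=(8, 3))           # 8 samples, 3 features
y = X.sum(axis=1, keepdims=True)      # toy target: sum of the features

W1 = rng.normal(scale=0.5, size=(3, 4)); b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(1000):
    # Forward pass: linear combination z = Wx + b, then ReLU non-linearity
    z1 = X @ W1 + b1
    a1 = np.maximum(z1, 0)            # ReLU
    y_hat = a1 @ W2 + b2
    loss = np.mean((y_hat - y) ** 2)  # mean squared error

    # Backpropagation: chain rule from the loss back to each parameter
    d_yhat = 2 * (y_hat - y) / len(X)
    dW2 = a1.T @ d_yhat; db2 = d_yhat.sum(axis=0)
    d_a1 = d_yhat @ W2.T
    d_z1 = d_a1 * (z1 > 0)            # ReLU gradient: 1 where z > 0
    dW1 = X.T @ d_z1; db1 = d_z1.sum(axis=0)

    # Gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(round(float(loss), 4))          # loss should shrink over iterations
```

The same mechanics scale up to the MLPs in `01_neural_networks/`; frameworks like PyTorch simply automate the backward pass.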
- Purpose: Capture spatial hierarchies in image data.
- Key Components:
- Convolution Layer — feature extraction using filters.
- Pooling Layer — downsampling to reduce dimensionality.
- Fully Connected Layer — combines extracted features for classification.
- Famous Architectures: LeNet, AlexNet, VGG, ResNet.
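The first two components can be sketched in plain NumPy on a single-channel image: a valid 2-D convolution (strictly, the cross-correlation that deep learning frameworks compute) for feature extraction, followed by 2×2 max pooling for downsampling. The kernel and image here are illustrative:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation of a 2-D image with a 2-D kernel."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling (truncates odd edges)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
edge_kernel = np.array([[1.0, -1.0]])   # simple horizontal-difference filter

features = conv2d(image, edge_kernel)   # (6, 5) feature map
pooled = max_pool2x2(features)          # (3, 2) after downsampling
print(features.shape, pooled.shape)
```

A fully connected layer would then flatten `pooled` and apply a weight matrix, exactly as in the MLP section above.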
- Purpose: Handle sequential data where order matters.
- Key Concept: Hidden state captures previous time-step information.
- Challenges: Vanishing gradients over long sequences, mitigated by the gating mechanisms of LSTM and GRU architectures.
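The hidden-state recurrence can be sketched in a few lines of NumPy: a vanilla RNN unrolled over a short sequence, where `h` carries information forward from earlier time steps. The weight names (`W_xh`, `W_hh`) and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
input_size, hidden_size, seq_len = 4, 5, 6

W_xh = rng.normal(scale=0.3, size=(input_size, hidden_size))   # input -> hidden
W_hh = rng.normal(scale=0.3, size=(hidden_size, hidden_size))  # hidden -> hidden
b_h = np.zeros(hidden_size)

xs = rng.normal(size=(seq_len, input_size))  # one input vector per time step
h = np.zeros(hidden_size)                    # initial hidden state

for x_t in xs:
    # h_t = tanh(x_t W_xh + h_{t-1} W_hh + b): the new state mixes the
    # current input with the summary of everything seen so far
    h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)

print(h.shape)  # final hidden state summarizes the whole sequence
```

Repeated multiplication by `W_hh` inside this loop is exactly where gradients vanish (or explode) during backpropagation through time, which motivates the LSTM/GRU gates.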
- Core Idea: Attention allows the model to weigh importance of input elements.
- Self-Attention: Computes attention scores between words in the same sequence.
- Architecture: The original Transformer is an encoder-decoder; BERT uses only the encoder stack, while GPT uses only the decoder stack.
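Scaled dot-product self-attention can be sketched directly in NumPy for a single sequence: each position's query is compared against every key, and the resulting softmax weights mix the values. The random `Wq`/`Wk`/`Wv` matrices stand in for learned projections; all sizes are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(2)
seq_len, d_model = 4, 8

X = rng.normal(size=(seq_len, d_model))              # token embeddings
Wq, Wk, Wv = (rng.normal(scale=0.3, size=(d_model, d_model)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

scores = Q @ K.T / np.sqrt(d_model)  # similarity between every pair of positions
weights = softmax(scores, axis=-1)   # each row is a distribution over positions
output = weights @ V                 # attention-weighted mix of the values

print(weights.shape, output.shape)
```

Multi-head attention simply runs several such projections in parallel on smaller slices of `d_model` and concatenates the results.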
- NumPy — basic neural network from scratch
- PyTorch or TensorFlow — deep learning frameworks
- torchvision / keras.datasets — datasets (MNIST, CIFAR-10, IMDB)
- MLP from scratch using NumPy
- CNNs using PyTorch (MNIST, CIFAR-10)
- RNN/LSTM for sequence data (text generation)
- Transformer mini-model with attention visualization
Each notebook includes:
- Theoretical explanations
- Code with inline comments
- Visualization of training curves
- Model evaluation and performance plots
Goal: Classify images using convolutional neural networks.
Concepts: CNN, Batch Normalization, Dropout, ResNet transfer learning.
Dataset: CIFAR-10 or MNIST.
Deliverables:
- CNN model from scratch.
- Performance comparison with pre-trained ResNet.
Goal: Predict movie review sentiment (positive/negative).
Concepts: RNN, LSTM, Embedding Layers.
Dataset: IMDB Reviews (Keras/TorchText).
Deliverables:
- Preprocessing and tokenization notebook.
- LSTM model implementation.
- Accuracy and confusion matrix visualizations.
Goal: Implement and visualize self-attention mechanism.
Concepts: Scaled Dot-Product Attention, Multi-Head Attention, Positional Encoding.
Deliverables:
- Build attention from scratch using PyTorch.
- Visualize attention weights.
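The positional-encoding concept listed above can be sketched with the sinusoidal formulation from the original Transformer paper: even dimensions use sine, odd dimensions use cosine, with geometrically spaced frequencies. Sequence length and model width here are illustrative:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings, shape (seq_len, d_model)."""
    pos = np.arange(seq_len)[:, None]        # positions 0..seq_len-1
    i = np.arange(0, d_model, 2)[None, :]    # even dimension indices
    angles = pos / (10000 ** (i / d_model))  # geometric frequency spacing
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)             # even dims: sine
    pe[:, 1::2] = np.cos(angles)             # odd dims: cosine
    return pe

pe = positional_encoding(seq_len=10, d_model=16)
print(pe.shape)
```

These encodings are added to the token embeddings so attention, which is otherwise order-invariant, can distinguish positions.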
| Model | Dataset | Accuracy | Notes |
|---|---|---|---|
| CNN (Custom) | MNIST | 98% | Strong generalization |
| ResNet (Transfer Learning) | CIFAR-10 | 92% | Pre-trained advantage |
| LSTM | IMDB | 88% | Context captured well |
| Mini Transformer | Custom Text | N/A | Visual attention verified |
- Mastered backpropagation and gradient-based optimization.
- Understood architectures (CNNs, RNNs, Transformers).
- Learned to use GPU acceleration and data loaders in PyTorch.
- Built modular, reusable model training pipelines.
➡️ Proceed to Generative AI Projects to explore VAEs, GANs, Diffusion Models, and LLMs.
- Python 3.9+
- Jupyter Notebook / VS Code
- Libraries:
  `numpy`, `torch`, `torchvision`, `matplotlib`, `transformers`
- MIT 6.S191 — Introduction to Deep Learning
- Ian Goodfellow — Deep Learning
- PyTorch Documentation — https://pytorch.org/tutorials
- Stanford CS231n — Convolutional Neural Networks for Visual Recognition
| Task | Status |
|---|---|
| Implement perceptron and MLP from scratch | ☐ |
| Train CNN on MNIST | ☐ |
| Implement ResNet (transfer learning) | ☐ |
| Implement RNN/LSTM for text generation | ☐ |
| Implement Mini Transformer | ☐ |
| Complete all 3 projects | ☐ |
| Documentation and Git push | ☐ |
⭐ Pro Tip:
Visualize intermediate layer activations and attention maps. Include training logs and plots in your assets folder for better presentation.
📌 Maintained by Moh Rafik
💬 For queries or collaborations: [RAFIKIITBHU@GMAIL.COM or LinkedIn]