Overview
This project demonstrates a deep understanding of neural network fundamentals by implementing every component of a CNN from the ground up, using only NumPy and CuPy for GPU acceleration. The implementation achieves 82.44% accuracy on the CIFAR-10 test set, comparable to what a similarly sized network trained in a modern deep learning framework achieves.
Technical Stack
- Core: Python 3.x, NumPy, CuPy (GPU acceleration)
- Visualization: Matplotlib
- Dataset: CIFAR-10 (60,000 RGB images, 32x32 pixels, 10 categories)
Architecture
Convolutional Blocks (3 blocks total):
- Conv layers with 32, 64, and 128 filters respectively (3x3 kernels)
- Batch Normalization after each convolution
- ReLU activation functions
- MaxPooling (2x2) for spatial downsampling
Fully Connected Layers:
- Flatten layer to convert 3D features to 1D
- FC layer: 2048 → 256 neurons
- Dropout (50%) for regularization
- Output layer: 256 → 10 classes (a shape check for this stack follows below)
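As a quick consistency check, the spatial dimensions can be traced through the three blocks. Assuming stride-1, same-padded 3x3 convolutions (which the 2048-unit FC input implies), the flattened feature size works out as follows:

```python
def conv_out(h, w, k=3, pad=1, stride=1):
    """Spatial size after a same-padded 3x3 convolution."""
    return ((h + 2 * pad - k) // stride + 1,
            (w + 2 * pad - k) // stride + 1)

def pool_out(h, w, k=2, stride=2):
    """Spatial size after 2x2 max pooling."""
    return ((h - k) // stride + 1, (w - k) // stride + 1)

h, w = 32, 32
for filters in (32, 64, 128):      # the three convolutional blocks
    h, w = conv_out(h, w)          # padding 1 preserves the spatial size
    h, w = pool_out(h, w)          # 2x2 pooling halves each dimension
print(128 * h * w)                 # -> 2048, the flattened FC input size
```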
Implementation Highlights
- Custom layer implementations throughout, with convolution built on im2col/col2im for fully vectorized computation
- MaxPooling with explicit forward and backward passes
- Batch Normalization with running statistics for inference
- Dropout regularization, and fully connected layers with custom weight initialization
- One Cycle learning rate scheduling
- Data augmentation (random horizontal flipping and cutout)
- Weight decay regularization
- A learning rate finder for choosing a good maximum learning rate
Illustrative sketches of these components appear below.
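The core of the convolution layer is the im2col trick: input patches are unfolded into a matrix so that the convolution reduces to a single matrix multiplication. A minimal NumPy sketch of the idea (not the notebook's exact code; substituting cupy for numpy runs it on GPU):

```python
import numpy as np

def im2col(x, k, stride=1, pad=1):
    """Unfold (N, C, H, W) into (N*out_h*out_w, C*k*k) so that convolution
    becomes a single matrix multiplication."""
    N, C, H, W = x.shape
    out_h = (H + 2 * pad - k) // stride + 1
    out_w = (W + 2 * pad - k) // stride + 1
    xp = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    cols = np.empty((N, C, k, k, out_h, out_w), dtype=x.dtype)
    for i in range(k):
        for j in range(k):
            cols[:, :, i, j] = xp[:, :, i:i + stride * out_h:stride,
                                        j:j + stride * out_w:stride]
    return cols.transpose(0, 4, 5, 1, 2, 3).reshape(N * out_h * out_w, -1), out_h, out_w

def conv_forward(x, w, b, stride=1, pad=1):
    """Convolution as im2col + GEMM; w has shape (F, C, k, k), b shape (F,)."""
    F, C, k, _ = w.shape
    cols, out_h, out_w = im2col(x, k, stride, pad)
    out = cols @ w.reshape(F, -1).T + b            # (N*out_h*out_w, F)
    return out.reshape(x.shape[0], out_h, out_w, F).transpose(0, 3, 1, 2)
```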
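MaxPooling's backward pass routes each upstream gradient to the input position that won the max in the forward pass. A sketch for the non-overlapping 2x2 case used here (ties at the max would share the gradient in this simplified version):

```python
import numpy as np

def maxpool_forward(x, k=2):
    """Non-overlapping k x k max pooling on (N, C, H, W); returns the output
    and a mask marking which positions produced each max (for backprop)."""
    N, C, H, W = x.shape
    xr = x.reshape(N, C, H // k, k, W // k, k)
    out = xr.max(axis=(3, 5))
    mask = xr == out[:, :, :, None, :, None]       # True at the max positions
    return out, mask

def maxpool_backward(dout, mask, k=2):
    """Route each upstream gradient to the position(s) that won the max."""
    N, C, out_h, out_w = dout.shape
    dx = mask * dout[:, :, :, None, :, None]
    return dx.reshape(N, C, out_h * k, out_w * k)
```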
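Batch Normalization normalizes each channel over the batch and spatial axes during training while accumulating running statistics, which replace the batch statistics at test time. A sketch, with an assumed momentum of 0.9 and parameters shaped (1, C, 1, 1):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, running_mean, running_var,
                      train=True, momentum=0.9, eps=1e-5):
    """Normalize each channel of (N, C, H, W) over the batch and spatial axes.
    The running buffers are updated in place during training and used as-is
    at test time."""
    if train:
        mu = x.mean(axis=(0, 2, 3), keepdims=True)
        var = x.var(axis=(0, 2, 3), keepdims=True)
        running_mean *= momentum
        running_mean += (1.0 - momentum) * mu
        running_var *= momentum
        running_var += (1.0 - momentum) * var
    else:
        mu, var = running_mean, running_var
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```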
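Dropout and weight decay are both small additions to the training loop. The sketch below uses inverted dropout (scaling at train time) and folds L2 weight decay into the gradient; the decay coefficient is an assumed default, not a value from the notebook:

```python
import numpy as np

def dropout_forward(x, p=0.5, train=True, rng=np.random):
    """Inverted dropout: kept units are scaled by 1/(1-p) at train time, so
    the test-time forward pass needs no rescaling."""
    if not train:
        return x, None
    mask = (rng.rand(*x.shape) >= p) / (1.0 - p)
    return x * mask, mask                  # the mask is reused in backprop

def sgd_step(w, dw, lr, weight_decay=1e-4):
    """Plain SGD with L2 weight decay added to the gradient."""
    return w - lr * (dw + weight_decay * w)
```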
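One Cycle scheduling ramps the learning rate up to a peak and then anneals it back down over training. The sketch below is one common variant (linear warm-up followed by cosine decay); the warm-up fraction and divisor are assumed defaults:

```python
import math

def one_cycle_lr(step, total_steps, max_lr, warmup_frac=0.3, div=25.0):
    """One Cycle variant: linear warm-up from max_lr/div to max_lr over the
    first warmup_frac of training, then cosine annealing back to max_lr/div."""
    warmup = max(1, int(total_steps * warmup_frac))
    base = max_lr / div
    if step < warmup:
        return base + (max_lr - base) * step / warmup
    t = (step - warmup) / max(1, total_steps - warmup)
    return base + 0.5 * (max_lr - base) * (1.0 + math.cos(math.pi * t))
```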
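The two augmentations are cheap array operations: a random horizontal flip and cutout, which zeroes a random square patch. A sketch with an assumed 8-pixel cutout window:

```python
import numpy as np

def augment(batch, cutout_size=8, rng=np.random):
    """Random horizontal flips and cutout on a (N, C, H, W) batch."""
    N, C, H, W = batch.shape
    out = batch.copy()
    flip = rng.rand(N) < 0.5                # flip each image with prob. 0.5
    out[flip] = out[flip][:, :, :, ::-1]
    for n in range(N):                      # cutout: zero a random square
        cy, cx = rng.randint(H), rng.randint(W)
        y0, y1 = max(0, cy - cutout_size // 2), min(H, cy + cutout_size // 2)
        x0, x1 = max(0, cx - cutout_size // 2), min(W, cx + cutout_size // 2)
        out[n, :, y0:y1, x0:x1] = 0.0
    return out
```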
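The learning rate finder runs a short range test: the learning rate is increased exponentially across mini-batches while the loss is recorded, and the peak rate for the One Cycle schedule is chosen just below where the loss starts to diverge. The range endpoints below are assumed defaults:

```python
import numpy as np

def lr_finder_lrs(num_iters, lr_min=1e-6, lr_max=1.0):
    """Exponentially spaced learning rates for an LR range test: train one
    mini-batch per rate, record the loss, and pick a peak LR just below
    where the loss curve blows up."""
    return lr_min * (lr_max / lr_min) ** (np.arange(num_iters) / (num_iters - 1))
```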
Results
- Training Accuracy: 86.62%
- Test Accuracy: 82.44%
- Significantly outperforms baseline methods (Nearest Neighbor: 35.39%, Nearest Class Centroid: 27.74%)
Full Implementation Notebook
Below is the complete Jupyter notebook showing the full implementation details, training process, and results.