Comprehensive implementation of Support Vector Machines and Deep Autoencoders for image classification and dimensionality reduction on CIFAR-10 and MNIST datasets, built from scratch using NumPy, CuPy, and PyTorch.
github.com/pompos02/NeuralNetworks-DeepLearning

This project consists of two major assignments demonstrating the implementation of classical machine learning algorithms and modern deep learning approaches. Assignment 2 focuses on Support Vector Machines with different kernels and optimization techniques, while Assignment 3 explores autoencoders for dimensionality reduction and image reconstruction.
Implementation of SVM algorithms from scratch for binary classification on the CIFAR-10 dataset (cats vs. dogs), using only NumPy and CuPy.
The linear SVM is implemented with hinge loss, L2 regularization, and stochastic gradient descent (SGD). Features include GPU acceleration via CuPy, learning-rate scheduling, input standardization, and data augmentation.
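A minimal NumPy sketch of this kind of hinge-loss SGD update (function and variable names are illustrative, not the repository's; labels are assumed to be in {-1, +1}):

```python
import numpy as np

def svm_sgd_epoch(X, y, w, b, lr=1e-3, lam=1e-4):
    """One SGD epoch for a linear SVM: hinge loss + L2 regularization.

    X: (n, d) feature matrix; y: (n,) labels in {-1, +1}.
    """
    for i in np.random.permutation(len(y)):
        margin = y[i] * (X[i] @ w + b)
        if margin < 1:
            # Hinge loss is active: subgradient includes the data term
            w -= lr * (lam * w - y[i] * X[i])
            b += lr * y[i]
        else:
            # Correctly classified with margin: only the L2 penalty acts
            w -= lr * lam * w
    return w, b
```

Because CuPy mirrors the NumPy API, the same update runs on the GPU by swapping `numpy` for `cupy`.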
Initial Linear SVM results showing training progress and confusion matrix
| Configuration | Train Acc | Test Acc |
|---|---|---|
| Basic Implementation | 59.06% | 58.55% |
| With LR Scheduler | 56.74% | 56.60% |
| + Data Augmentation | 60.53% | 60.55% |
| Final Optimized | 63.69% | 63.40% |
Final optimized Linear SVM with hyperparameter tuning
The polynomial SVM uses the kernel K(x, y) = (x·y + c)^d and is implemented in the dual form, trained with gradient ascent.
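As a sketch of how such a dual solver can look (this simplified version drops the bias term, and hence the equality constraint Σαᵢyᵢ = 0; all names are illustrative):

```python
import numpy as np

def poly_kernel(X1, X2, c=1.0, d=2):
    # K(x, y) = (x . y + c)^d
    return (X1 @ X2.T + c) ** d

def dual_gradient_ascent(X, y, C=10.0, lr=1e-3, epochs=200, c=1.0, d=2):
    """Projected gradient ascent on the (bias-free) SVM dual.

    Maximizes W(a) = sum_i a_i - 0.5 * sum_ij a_i a_j y_i y_j K(x_i, x_j)
    subject to 0 <= a_i <= C, with y in {-1, +1}.
    """
    K = poly_kernel(X, X, c, d)
    a = np.zeros(len(y))
    for _ in range(epochs):
        grad = 1.0 - y * (K @ (a * y))  # dW/da
        a = np.clip(a + lr * grad, 0.0, C)
    return a

def dual_predict(X_train, y_train, a, X_test, c=1.0, d=2):
    # Decision function is a kernel expansion over the support vectors
    return np.sign(poly_kernel(X_test, X_train, c, d) @ (a * y_train))
```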
| Hyperparameters | Train Acc | Test Acc |
|---|---|---|
| Degree=3, C=3.0 | 63.38% | 62.60% |
| Best: Degree=2, C=10 (with tuning) | 73.49% | 64.25% |
Comprehensive grid search results showing optimal hyperparameter combinations
The RBF SVM uses the kernel K(x, y) = exp(-γ||x − y||²) and is trained with the Sequential Minimal Optimization (SMO) algorithm for efficiency. It achieved the best overall performance among all SVM variants.
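SMO itself optimizes pairs of dual variables analytically and is too long to sketch here, but the kernel matrix computation that dominates its runtime vectorizes cleanly; a sketch, with an illustrative gamma value:

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=0.01):
    """K(x, y) = exp(-gamma * ||x - y||^2), computed for all pairs at once.

    Uses ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2 to avoid Python loops.
    """
    sq1 = np.sum(X1 ** 2, axis=1)[:, None]
    sq2 = np.sum(X2 ** 2, axis=1)[None, :]
    d2 = np.maximum(sq1 - 2.0 * (X1 @ X2.T) + sq2, 0.0)  # clamp float error
    return np.exp(-gamma * d2)
```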
| Implementation | Train Acc | Test Acc | Time |
|---|---|---|---|
| RBF SMO (Optimized) | 89.91% | 65.70% | 102 mins |
| RBF SMO (Basic) | 68.54% | 64.25% | Long |
RBF SVM training curves showing convergence and final performance metrics
Implementation and comparison of autoencoder architectures for MNIST digit reconstruction, evaluated against PCA, with downstream classification performance analyzed using a CNN.
Architecture: 784 → 128 → 32 → 128 → 784, with ReLU activations in the hidden layers and a sigmoid output, trained with MSE loss.
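A minimal PyTorch sketch of this architecture (whether the 32-dimensional bottleneck itself carries a ReLU is an assumption here):

```python
import torch.nn as nn

class BasicAutoencoder(nn.Module):
    """784 -> 128 -> 32 -> 128 -> 784, ReLU hidden layers, sigmoid output."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128), nn.ReLU(),
            nn.Linear(128, 32), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(32, 128), nn.ReLU(),
            nn.Linear(128, 784), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```

Training pairs this model with `nn.MSELoss()` on flattened, [0, 1]-scaled MNIST images; the deep variant below follows the same pattern with 512 → 256 → 64 encoder widths.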
Results: Train Loss: 0.0103, Test Loss: 0.0099
Basic Autoencoder training loss progression
Sample digit reconstructions from Basic Autoencoder
A deeper architecture: 784 → 512 → 256 → 64 → 256 → 512 → 784, whose additional layers give the network more capacity for feature learning.
Improved results: Train Loss: 0.0045, Test Loss: 0.0047 (roughly a 50% reduction in test MSE over the basic autoencoder).
Deep Autoencoder showing superior convergence and lower loss
t-SNE visualization of the 32D and 64D latent spaces reduced to 2D, showing clear digit clustering patterns.
t-SNE visualization showing distinct digit clusters in latent space
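A sketch of how such a plot can be produced with scikit-learn's `TSNE`, assuming an autoencoder that exposes an `encoder` attribute as in the sketch above:

```python
import matplotlib.pyplot as plt
import torch
from sklearn.manifold import TSNE

@torch.no_grad()
def plot_latent_tsne(model, loader, device="cpu"):
    """Project encoder outputs to 2D with t-SNE, colored by digit label."""
    codes, labels = [], []
    for x, y in loader:  # t-SNE is slow; a few thousand samples suffice
        codes.append(model.encoder(x.view(x.size(0), -1).to(device)).cpu())
        labels.append(y)
    z2d = TSNE(n_components=2).fit_transform(torch.cat(codes).numpy())
    plt.scatter(z2d[:, 0], z2d[:, 1], c=torch.cat(labels).numpy(),
                cmap="tab10", s=2)
    plt.colorbar(label="digit")
    plt.show()
```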
| Method | Test MSE | Bottleneck Size |
|---|---|---|
| Deep Autoencoder | 0.0047 | 64 |
| Basic Autoencoder (64) | 0.0063 | 64 |
| PCA Reconstruction | 0.0090 | 64 |
Side-by-side comparison of PCA and Autoencoder reconstructions
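The PCA baseline can be reproduced along these lines with scikit-learn (a sketch; assumes images are flattened to 784-dimensional vectors scaled to [0, 1]):

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_reconstruction_mse(X_train, X_test, n_components=64):
    """Fit PCA on training images, reconstruct test images, return mean MSE."""
    pca = PCA(n_components=n_components).fit(X_train)
    X_rec = pca.inverse_transform(pca.transform(X_test))
    return np.mean((X_test - X_rec) ** 2)
```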
A CNN was implemented to classify MNIST digits from three different inputs: the original images, autoencoder reconstructions, and PCA reconstructions.
CNN training and validation accuracy/loss curves showing excellent convergence
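The repository's exact CNN architecture is not reproduced here; a typical small conv net for this experiment looks like the following sketch, trained once per input type (originals, autoencoder reconstructions, PCA reconstructions):

```python
import torch.nn as nn

class MNISTCNN(nn.Module):
    """A small conv-pool-fc classifier; layer sizes here are assumptions."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):  # x: (B, 1, 28, 28)
        return self.classifier(self.features(x))
```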
| Input Type | CNN Accuracy | Performance Drop |
|---|---|---|
| Original MNIST | 98.89% | Baseline |
| Deep AE Reconstruction | 98.27% | -0.62% |
| PCA Reconstruction | 98.22% | -0.67% |
Analysis of reconstruction error by digit class reveals which digits are most challenging for the models.
Basic Autoencoder per-digit loss
Deep Autoencoder per-digit performance
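Per-class error can be computed by averaging the reconstruction MSE over each digit's test samples; a sketch, assuming a loader that yields (image, label) batches:

```python
import torch

@torch.no_grad()
def per_digit_mse(model, loader, device="cpu"):
    """Mean reconstruction MSE for each digit class 0-9."""
    totals, counts = torch.zeros(10), torch.zeros(10)
    for x, y in loader:
        x = x.view(x.size(0), -1).to(device)
        err = ((model(x) - x) ** 2).mean(dim=1).cpu()
        for digit in range(10):
            mask = y == digit
            totals[digit] += err[mask].sum()
            counts[digit] += mask.sum()
    return totals / counts  # index d holds the average MSE for digit d
```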
| Algorithm | Test Accuracy | Complexity |
|---|---|---|
| RBF SVM (SMO) | 65.70% | High |
| Polynomial SVM | 64.25% | Medium |
| Linear SVM | 63.40% | Low |
| Linear SMO | 61.90% | Medium |
| K-NN (k=1) | 57.85% | Low |
| Model | Metric | Performance |
|---|---|---|
| CNN on Original | Accuracy | 98.89% |
| Deep Autoencoder | MSE Loss | 0.0047 |
| Basic Autoencoder | MSE Loss | 0.0099 |
| PCA (64 components) | MSE Loss | 0.0090 |
| CNN on Deep AE Reconstruction | Accuracy | 98.27% |
| CNN on PCA Reconstruction | Accuracy | 98.22% |
Deeper autoencoders significantly outperform shallow ones (roughly a 50% reduction in test MSE), and non-linear mappings (autoencoders) are superior to linear methods (PCA) for complex image reconstruction.
The minimal accuracy drop (<1%) when classifying reconstructed images demonstrates excellent information preservation across all methods.
The RBF SVM achieves the highest accuracy (65.70%) but requires extensive training time (102+ minutes); linear methods provide a strong baseline with fast training.
Classical methods (PCA, linear SVM) offer interpretable and fast baselines, while modern methods (deep autoencoders, RBF SVM) achieve superior performance at the cost of increased complexity.