Overview
This project was built to understand OCR and neural networks below the framework level. Rather than relying on TensorFlow alone, it implements forward propagation, backpropagation, activation functions, convolution, pooling, feature extraction, and model serialization directly in NumPy, and uses TensorFlow as a comparison baseline.
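To make "below the framework level" concrete, here is a minimal sketch of a forward and backward pass for one fully connected layer in NumPy. It is an illustration, not the project's actual code: the function names `dense_forward` and `dense_backward`, the sigmoid activation, the squared-error loss, and the learning rate are all assumptions chosen for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(x, W, b):
    """Return the sigmoid activations of a fully connected layer."""
    return sigmoid(x @ W + b)

def dense_backward(x, W, a, grad_a):
    """Backpropagate dLoss/da through the layer.

    Returns gradients with respect to W, b, and the layer input x.
    """
    grad_z = grad_a * a * (1.0 - a)      # chain rule through the sigmoid
    grad_W = x.T @ grad_z                # same shape as W
    grad_b = grad_z.sum(axis=0)          # same shape as b
    grad_x = grad_z @ W.T                # would be passed to the previous layer
    return grad_W, grad_b, grad_x

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))              # hypothetical batch: 4 samples, 3 features
W = rng.normal(size=(3, 2)) * 0.1
b = np.zeros(2)
y = rng.normal(size=(4, 2))              # hypothetical targets

a = dense_forward(x, W, b)
loss = 0.5 * np.sum((a - y) ** 2)        # squared-error loss, so dLoss/da = a - y
grad_W, grad_b, grad_x = dense_backward(x, W, a, a - y)

W -= 0.1 * grad_W                        # one gradient-descent step
b -= 0.1 * grad_b
```

A framework would derive the three gradient lines automatically. Writing them by hand is what exposes how the activation's derivative scales every update.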
Key Features
- Interactive Streamlit interface for drawing and recognizing digits, letters, words, and sentences.
- Custom ANN pipeline using handcrafted Sobel edge-segment features from MNIST-style images.
- Manual CNN components including convolution, ReLU activation, max pooling, flattening, and dense layers.
- Side-by-side predictions from custom ANN, custom CNN, and TensorFlow CNN models.
- Visual preprocessing output so model behavior can be inspected instead of treated as a black box.
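
The manual CNN components listed above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the project's implementation: it handles a single-channel image and a single filter, uses stride 1 with no padding, and reuses a fixed horizontal Sobel kernel as the filter so the output is easy to inspect. The names `conv2d`, `relu`, `max_pool`, and `SOBEL_X` are hypothetical.

```python
import numpy as np

# Horizontal-gradient Sobel kernel; responds to vertical edges.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def conv2d(image, kernel):
    """Valid cross-correlation of a 2-D image with a 2-D kernel (stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    windows = x[:h, :w].reshape(h // size, size, w // size, size)
    return windows.max(axis=(1, 3))

# Synthetic 28x28 test image: dark left half, bright right half.
image = np.zeros((28, 28))
image[:, 14:] = 1.0

# convolution -> ReLU -> max pooling -> flatten
features = max_pool(relu(conv2d(image, SOBEL_X))).ravel()
print(features.shape)   # (169,)
```

The flattened vector is what a dense layer would consume. In a learned CNN the kernel values are trained rather than fixed, which is the main difference from a handcrafted Sobel feature pipeline.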
Evidence
The project compares the custom ANN and custom CNN against a TensorFlow CNN baseline, which makes accuracy gaps, the limits of handcrafted features, and the advantages of convolutional architectures easier to explain.
What I Learned
Building the model components manually made the tradeoffs behind CNNs more concrete: handcrafted edge features are fragile, activation functions can saturate or destabilize training, and learned convolutional filters solve image-recognition problems that shallow feature pipelines struggle with.
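
The saturation point can be illustrated with a short sketch. It assumes a sigmoid activation for comparison against ReLU (the text above does not say which activations saturated in practice), and the helper names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid with respect to its input."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (z > 0).astype(float)

z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid_grad(z))   # peaks at 0.25 for z = 0, nearly zero at |z| = 10
print(relu_grad(z))      # 0 for non-positive inputs, 1 for positive inputs
```

Because backpropagation multiplies by this derivative at every layer, pre-activations far from zero leave a sigmoid layer with almost no gradient to learn from. ReLU avoids that for positive inputs but passes no gradient at all for negative ones.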