Beginner10617/CMNIST

CMNIST

A from-scratch implementation of MNIST digit classification in C.

CMNIST is a minimal deep learning project that implements a Multi-Layer Perceptron (MLP), automatic differentiation, and a training loop entirely in C, without using any external ML libraries.


Features

  • MNIST Dataset file parsing (idx3-ubyte, idx1-ubyte)
  • Custom Value structure
  • Reverse-mode automatic differentiation
  • Multi-Layer Perceptron implementation
  • Forward + Backward propagation
  • SGD (Stochastic Gradient Descent) based training loop
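The SGD update named in the list above amounts to stepping each parameter against its gradient. A minimal sketch, assuming the parameters and gradients are available as flat arrays (the repository stores them in Value nodes instead); `sgd_step` and its arguments are hypothetical names, not taken from the source:

```c
#include <stddef.h>

/* Hypothetical helper, not from the repository.
 * Applies one SGD step, p <- p - learning_rate * g, to every parameter,
 * then clears the gradient so the next backward pass starts from zero. */
void sgd_step(double *params, double *grads, size_t n, double learning_rate)
{
    for (size_t i = 0; i < n; i++) {
        params[i] -= learning_rate * grads[i];
        grads[i] = 0.0;
    }
}
```

Clearing the gradients inside the step is a design choice of this sketch; it matters because reverse-mode autodiff accumulates into `grad` with `+=`.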

Model Description

The model consists of 3 simple layers:

  1. Input Layer: 32 neurons, each with 784 inputs (28x28 flattened image data)
  2. Intermediate Layer: 16 neurons, each taking the 32 activations of the previous layer as inputs
  3. Output Layer: 10 neurons, one per digit class (0-9)

Activation: The first two layers use the hyperbolic tangent (tanh) as the activation function, while the output layer has no activation.

Loss: The loss is the sum of squared deviations over all samples in a batch. Scaling by the batch size to obtain a true "mean" squared error apparently caused the loss to explode.

Current accuracy on the test set is ~95.12%
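A summed squared-error loss of this kind can be sketched as follows. The one-hot target encoding is an assumption (the source does not say how labels are turned into targets), and `sample_squared_error` and `batch_loss` are hypothetical names:

```c
#include <stddef.h>

#define NUM_CLASSES 10

/* Hypothetical helper: squared error of one sample's 10 outputs against
 * a one-hot target built from its label (one-hot is an assumption). */
double sample_squared_error(const double *out, int label)
{
    double loss = 0.0;
    for (int k = 0; k < NUM_CLASSES; k++) {
        double target = (k == label) ? 1.0 : 0.0;
        double diff = out[k] - target;
        loss += diff * diff;
    }
    return loss;
}

/* Sum (not mean) of the per-sample errors over a batch.
 * outs holds batch_size consecutive groups of NUM_CLASSES outputs. */
double batch_loss(const double *outs, const int *labels, size_t batch_size)
{
    double total = 0.0;
    for (size_t n = 0; n < batch_size; n++)
        total += sample_squared_error(outs + n * NUM_CLASSES, labels[n]);
    return total;
}
```

Dividing `total` by `batch_size` would give the mean instead. That also scales every gradient by `1 / batch_size`, so the two variants are not interchangeable at the same learning rate.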


Compile instructions

To build and run the training executable:

gcc train.c train_utils.c fileSystem.c neuron.c neuron_utils.c -o train
./train

To build and run the testing executable:

gcc main.c train_utils.c fileSystem.c neuron.c neuron_utils.c -o test
./test

Implementation Details

MNIST Parsing

  • Reads raw binary IDX format
  • Performs manual endianness conversion
  • Allocates memory dynamically for images and labels
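The parsing steps above can be sketched for an image file as follows. The IDX header stores a magic number (0x00000803 for idx3-ubyte images) followed by the image count, rows, and columns, each as a big-endian 32-bit integer. The function names here are hypothetical and not taken from the repository's fileSystem.c:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Assemble a big-endian 32-bit integer byte by byte, so the result is
 * correct regardless of the host's endianness. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical loader for an idx3-ubyte image file.
 * On success returns a malloc'd buffer of count * rows * cols pixel bytes
 * (the caller frees it) and fills in the three dimensions.
 * Returns NULL on any I/O, format, or allocation failure. */
unsigned char *load_idx3_images(const char *path, uint32_t *count,
                                uint32_t *rows, uint32_t *cols)
{
    unsigned char header[16];
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    if (fread(header, 1, sizeof header, f) != sizeof header ||
        read_be32(header) != 0x00000803u) {
        fclose(f);
        return NULL;
    }
    *count = read_be32(header + 4);
    *rows  = read_be32(header + 8);
    *cols  = read_be32(header + 12);

    size_t total = (size_t)*count * *rows * *cols;
    unsigned char *pixels = malloc(total);
    if (pixels && fread(pixels, 1, total, f) != total) {
        free(pixels);
        pixels = NULL;
    }
    fclose(f);
    return pixels;
}
```

A label file (idx1-ubyte) follows the same pattern with magic number 0x00000801 and an 8-byte header holding only the item count.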

Autograd Engine

Each Value node contains:

  • data
  • grad
  • _prev (dependencies)
  • _backward function pointer
  • name for debugging
  • operation metadata
  • visited flag for graph traversal
  • _isparameter for handling memory allocation

Backward pass is performed by recursive graph traversal.
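One way such a node and its recursive backward pass could look is sketched below. The field names follow the list above, but the field types, the two-parent limit, the omission of the name and operation metadata, and every function name are assumptions made for illustration. The sketch orders nodes with a depth-first traversal and then runs each `_backward` from the output toward the inputs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Reduced sketch of a Value node (types are assumptions). */
typedef struct Value {
    double data;
    double grad;
    struct Value *_prev[2];                /* inputs of the op, or NULL */
    void (*_backward)(struct Value *self); /* pushes grad to _prev */
    int visited;
    int _isparameter;
} Value;

Value *value_new(double data, int is_parameter)
{
    Value *v = malloc(sizeof *v);
    if (!v)
        return NULL;
    v->data = data;
    v->grad = 0.0;
    v->_prev[0] = v->_prev[1] = NULL;
    v->_backward = NULL;
    v->visited = 0;
    v->_isparameter = is_parameter;
    return v;
}

/* d(a+b)/da = 1 and d(a+b)/db = 1 */
static void add_backward(Value *out)
{
    out->_prev[0]->grad += out->grad;
    out->_prev[1]->grad += out->grad;
}

/* d(a*b)/da = b and d(a*b)/db = a */
static void mul_backward(Value *out)
{
    out->_prev[0]->grad += out->_prev[1]->data * out->grad;
    out->_prev[1]->grad += out->_prev[0]->data * out->grad;
}

static Value *binary_op(Value *a, Value *b, double data,
                        void (*backward)(Value *))
{
    Value *out = value_new(data, 0);
    if (out) {
        out->_prev[0] = a;
        out->_prev[1] = b;
        out->_backward = backward;
    }
    return out;
}

Value *value_add(Value *a, Value *b)
{
    return binary_op(a, b, a->data + b->data, add_backward);
}

Value *value_mul(Value *a, Value *b)
{
    return binary_op(a, b, a->data * b->data, mul_backward);
}

/* Depth-first post-order: a node is appended only after all its inputs. */
static void build_topo(Value *v, Value **order, size_t *n)
{
    if (v == NULL || v->visited)
        return;
    v->visited = 1;
    build_topo(v->_prev[0], order, n);
    build_topo(v->_prev[1], order, n);
    order[(*n)++] = v;
}

/* Backpropagates from root. The caller supplies `order`, which must have
 * room for every node in the graph (an assumption of this sketch).
 * Returns the number of nodes visited. */
size_t value_backward(Value *root, Value **order)
{
    size_t n = 0;
    build_topo(root, order, &n);
    root->grad = 1.0;
    for (size_t i = n; i-- > 0;) {
        if (order[i]->_backward)
            order[i]->_backward(order[i]);
        order[i]->visited = 0; /* reset the flag for the next pass */
    }
    return n;
}
```

For `c = a * b + a` with `a = 2` and `b = 3`, calling `value_backward(c, order)` leaves `a->grad` at `b + 1 = 4` and `b->grad` at `a = 2`. The `visited` flag is what keeps the shared node `a` from being processed twice.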

After a backward pass completes, the non-parameter Value nodes created during the forward pass must be freed using freeComputationTree().
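The cleanup idea can be sketched as below. This is not the repository's freeComputationTree(), whose signature is not shown in the source; `Node` is a reduced stand-in for the Value structure, and `free_temporaries` is a hypothetical name. The sketch collects every reachable node once before freeing anything, so a node shared by two parents is neither freed twice nor read after being freed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Reduced stand-in for the Value node; only the fields needed here. */
typedef struct Node {
    struct Node *_prev[2];
    int visited;
    int _isparameter;
} Node;

/* Record each reachable node exactly once, using the visited flag. */
static void collect(Node *v, Node **seen, size_t *n)
{
    if (v == NULL || v->visited)
        return;
    v->visited = 1;
    seen[(*n)++] = v;
    collect(v->_prev[0], seen, n);
    collect(v->_prev[1], seen, n);
}

/* Frees every non-parameter node reachable from root and clears the
 * visited flag on the parameters that survive. `seen` is caller-supplied
 * scratch space that must hold every node in the graph (an assumption
 * of this sketch). Returns the number of nodes freed. */
size_t free_temporaries(Node *root, Node **seen)
{
    size_t n = 0, freed = 0;
    collect(root, seen, &n);
    for (size_t i = 0; i < n; i++) {
        if (seen[i]->_isparameter) {
            seen[i]->visited = 0;
        } else {
            free(seen[i]);
            freed++;
        }
    }
    return freed;
}
```

Parameters (weights and biases) persist across iterations, which is why the `_isparameter` flag exempts them from the cleanup.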


Known Issues

  • With large batch sizes (~100), the loss explodes
  • Averaging the loss over a batch, instead of summing it, also leads to loss explosion
  • Training is slow for large layers (the originally intended MLP structure took hours to run a training loop)

Inspiration

  • After watching this video by bigboxSWE, I picked this project, which is usually implemented in Python, and did it in C instead
  • Andrej Karpathy's intro to neural networks YouTube video
