CMNIST

A from-scratch implementation of MNIST digit classification in C.

CMNIST is a minimal deep learning project that implements a Multi-Layer Perceptron (MLP), automatic differentiation, and a training loop entirely in C, without using any external ML libraries.

Features

MNIST Dataset file parsing (idx3-ubyte, idx1-ubyte)
Custom Value structure
Reverse-mode automatic differentiation
Multi-Layer Perceptron implementation
Forward + Backward propagation
SGD (Stochastic Gradient Descent) based training loop

Model Description

The model consists of 3 simple layers:

Input Layer : 32 neurons, each with 784 inputs (28x28 flattened image data)
Intermediate Layer : 16 neurons, each taking 32 previous neuron activations as inputs
Output Layer : 10 neurons, one for each digit class (0-9)

Activation: The first two layers have hyperbolic tangent (tanh) as the activation function, while the output layer has none

Loss: The loss is the sum of squared errors over all samples in a batch. Scaling the sum by the batch size to get a true "mean" squared error appeared to make the loss explode, so the unscaled sum is used.
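As a rough illustration of the summed squared-error loss and the SGD update it feeds, here is a small sketch on plain arrays. The function names (`sse_loss`, `batch_sse_loss`, `sgd_step`) are hypothetical, and the one-hot target encoding (1.0 for the true digit, 0.0 elsewhere) is an assumption; the project itself computes gradients through its autograd engine rather than by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Squared error of one sample against a one-hot target (assumed encoding).
   If grad is non-NULL it receives dL/d(out[k]) = 2 * (out[k] - target[k]). */
double sse_loss(const double *out, size_t n, size_t label, double *grad) {
    assert(label < n);
    double loss = 0.0;
    for (size_t k = 0; k < n; k++) {
        double target = (k == label) ? 1.0 : 0.0;
        double diff = out[k] - target;
        loss += diff * diff;
        if (grad) grad[k] = 2.0 * diff;
    }
    return loss;
}

/* Batch loss: per-sample losses are summed, not divided by the batch size.
   outs holds batch rows of n outputs each. */
double batch_sse_loss(const double *outs, const size_t *labels,
                      size_t batch, size_t n) {
    double total = 0.0;
    for (size_t s = 0; s < batch; s++)
        total += sse_loss(outs + s * n, n, labels[s], NULL);
    return total;
}

/* Plain SGD step: p <- p - lr * g for every parameter. */
void sgd_step(double *params, const double *grads, size_t n, double lr) {
    for (size_t i = 0; i < n; i++)
        params[i] -= lr * grads[i];
}
```

Dividing `batch_sse_loss` by `batch` would give the mean squared error variant mentioned above.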

Current accuracy on the test set is ~95.12%

Compile instructions

To build and run the training executable:

gcc train.c train_utils.c fileSystem.c neuron.c neuron_utils.c -o train
./train

To build and run the testing executable:

gcc main.c train_utils.c fileSystem.c neuron.c neuron_utils.c -o test
./test

Implementation Details

MNIST Parsing

Reads raw binary IDX format
Performs manual endianness conversion
Allocates memory dynamically for images and labels
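The parsing steps above can be sketched as follows. IDX files store a 4-byte big-endian magic number (2051 for idx3-ubyte image files) followed by big-endian 32-bit dimension sizes and then the raw bytes. The names `IdxImages`, `be32_to_host`, `read_be32`, and `load_idx_images` are hypothetical and not taken from fileSystem.c; this is only one way to do the conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for a parsed idx3-ubyte file. */
typedef struct {
    uint32_t count, rows, cols;
    unsigned char *pixels;   /* count * rows * cols bytes, heap-allocated */
} IdxImages;

/* Manual endianness conversion: assemble a big-endian 32-bit value from
   4 bytes. Works regardless of the host byte order. */
static uint32_t be32_to_host(const unsigned char b[4]) {
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Read one big-endian uint32 from the stream; returns 0 on success. */
static int read_be32(FILE *fp, uint32_t *out) {
    unsigned char b[4];
    if (fread(b, 1, 4, fp) != 4) return -1;
    *out = be32_to_host(b);
    return 0;
}

/* Load an idx3-ubyte image file. Returns 0 on success, -1 on failure.
   On success the caller owns out->pixels and must free() it. */
static int load_idx_images(const char *path, IdxImages *out) {
    assert(path != NULL && out != NULL);
    out->pixels = NULL;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return -1;

    uint32_t magic = 0;
    if (read_be32(fp, &magic) != 0 || magic != 2051u ||
        read_be32(fp, &out->count) != 0 ||
        read_be32(fp, &out->rows) != 0 ||
        read_be32(fp, &out->cols) != 0) {
        fclose(fp);
        return -1;
    }

    size_t n = (size_t)out->count * out->rows * out->cols;
    if (n > 0) out->pixels = malloc(n);
    if (out->pixels == NULL || fread(out->pixels, 1, n, fp) != n) {
        free(out->pixels);
        out->pixels = NULL;
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return 0;
}
```

A label file (idx1-ubyte) would follow the same pattern with magic number 2049 and a single dimension.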

Autograd Engine
Each Value node contains:

data
grad
_prev (dependencies)
_backward function pointer
name for debugging
operation metadata
visited flag for graph traversal
_isparameter for handling memory allocation

Backward pass is performed by recursive graph traversal.

After a backward pass completes, the memory allocated to non-parameter Value nodes created during the forward pass must be freed using freeComputationTree()

Known Issues

For large batch sizes (~100), the loss explodes
Averaging the loss over a batch, instead of summing it, also makes the loss explode
Training is slow for large layers (the originally intended MLP structure took hours per training loop)

Inspiration

After watching this video by bigboxSWE, I picked this project, which is usually implemented in Python, and did it in C instead
Andrej Karpathy's intro to neural networks YouTube video
