SudokuVision is an OCR-powered Sudoku solver that uses machine learning and image processing techniques to extract Sudoku grids from images and solve them. It is designed to work with both handwritten and printed puzzles.
SudokuVision integrates Optical Character Recognition (OCR) for grid extraction and a deep learning model to recognize the digits within the grid. The application then passes the recognized digits to the PySudoku library, which solves the puzzle.
- OCR-Based Sudoku Grid Detection
- Digit Recognition Using Deep Learning
- PySudoku-Based Solver
- Works with Printed and Handwritten Sudoku Puzzles
- Supports PNG, JPG, JPEG Images
- Grid Extraction: Automatically extracts Sudoku grids from images using image processing techniques.
- Digit Recognition: Identifies digits within the grid using deep learning models.
- Fast Solver: Solves puzzles using the PySudoku backtracking solver.
- Command Line and Web Interface: Provides both a command-line interface and a Streamlit-based web interface.
- Multiple Image Formats: Works with PNG, JPG, and JPEG images.
- Real-time Visualization: Displays the solved Sudoku puzzle directly in the web app and, for the command-line interface, using OpenCV.
The following datasets were used for training the deep learning model for digit recognition and grid extraction:
The MNIST dataset was used to train the deep learning model for digit recognition. Its training split contains 60,000 handwritten digit images (plus 10,000 test images), each a 28x28-pixel grayscale image, which makes it well suited to training models that recognize handwritten digits.
- Preprocessing:
  - Grayscale normalization to a range of [0, 1].
  - Reshaped to 28x28 pixels.
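The two preprocessing steps above can be sketched without any dependencies. This is a minimal illustration, not the project's code: the function name `preprocess_flat_image` is hypothetical, and the project itself presumably works on NumPy/Keras arrays, where the same result comes from dividing by 255 and reshaping.

```python
from typing import List

IMAGE_SIDE = 28  # MNIST images are 28x28 pixels


def preprocess_flat_image(pixels: List[int]) -> List[List[float]]:
    """Scale 8-bit grayscale values to [0, 1] and reshape to 28x28 rows.

    Hypothetical helper for illustration only.
    """
    expected = IMAGE_SIDE * IMAGE_SIDE
    if len(pixels) != expected:
        raise ValueError(f"expected {expected} pixels, got {len(pixels)}")
    # Normalization: 0 maps to 0.0 and 255 maps to 1.0.
    scaled = [value / 255.0 for value in pixels]
    # Reshape: cut the flat list into 28 rows of 28 values each.
    return [
        scaled[row * IMAGE_SIDE:(row + 1) * IMAGE_SIDE]
        for row in range(IMAGE_SIDE)
    ]


# Synthetic input: a flat image whose values cycle through 0..255.
flat = [i % 256 for i in range(IMAGE_SIDE * IMAGE_SIDE)]
image = preprocess_flat_image(flat)
print(len(image), len(image[0]))
```

Scaling to [0, 1] keeps the input range consistent across datasets, which matters here because the same model is trained on images from several sources.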
The Chars74K dataset contains images of characters, including digits, in a variety of fonts. It was used to supplement training with diverse digital font variations.
- Preprocessing:
  - Resized to 28x28 pixels.
  - Grayscale conversion and normalization to a range of [0, 1].
This dataset improves the model's ability to recognize digits in digital fonts, which helps accuracy on printed puzzles.
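As a rough sketch of the Chars74K preprocessing described above, the following dependency-free code converts RGB pixels to grayscale in [0, 1] and resizes the result to 28x28. The helper names are hypothetical, and nearest-neighbour sampling is an assumption made for brevity; the project most likely relies on an image library such as OpenCV, whose interpolation may differ.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[float]]


def to_grayscale(image: RGBImage) -> GrayImage:
    """Convert 8-bit RGB pixels to luma in [0, 1] (ITU-R BT.601 weights)."""
    return [
        [(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for (r, g, b) in row]
        for row in image
    ]


def resize_nearest(image: GrayImage, size: int = 28) -> GrayImage:
    """Resize to size x size using nearest-neighbour sampling (assumption)."""
    height, width = len(image), len(image[0])
    return [
        [image[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


# Synthetic 56x40 image: left half black, right half white.
half_black = [[(0, 0, 0)] * 20 + [(255, 255, 255)] * 20 for _ in range(56)]
small = resize_nearest(to_grayscale(half_black))
print(len(small), len(small[0]))
```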
The TMNIST dataset is another handwritten digit dataset used to further train and diversify the digit recognition capabilities. It contains images in the same format as MNIST and was used to train the model on additional handwritten digits.
- Preprocessing:
  - Data is scaled to a range of [0, 1].
  - Labels are encoded using LabelEncoder and converted to categorical values.
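The label handling above can be illustrated with a small stand-in written in plain Python. It mimics what scikit-learn's `LabelEncoder` does (sorted classes mapped to integer indices) followed by a one-hot conversion, which is assumed here to be what "categorical values" means. The function names are hypothetical and this is not the project's code.

```python
from typing import Dict, List, Sequence, Tuple


def encode_labels(labels: Sequence[str]) -> Tuple[List[int], List[str]]:
    """Map each distinct label to an integer index, in sorted class order."""
    classes = sorted(set(labels))
    index: Dict[str, int] = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


def to_one_hot(indices: Sequence[int], num_classes: int) -> List[List[float]]:
    """Turn integer class indices into one-hot rows."""
    rows = []
    for i in indices:
        row = [0.0] * num_classes
        row[i] = 1.0
        rows.append(row)
    return rows


raw_labels = ["3", "0", "7", "3"]
encoded, classes = encode_labels(raw_labels)
one_hot = to_one_hot(encoded, len(classes))
print(classes)
print(encoded)
print(one_hot)
```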
- Puzzle Extraction: The uploaded image is processed using OpenCV for grid extraction. The grid's edges are detected, and the puzzle is segmented into individual cells.
- Digit Recognition: Each cell in the grid is processed by a deep learning model that recognizes the digit it contains. The model is trained on the MNIST, Chars74K, and TMNIST datasets.
- Puzzle Solving: Once the digits are identified, they are passed to the PySudoku solver, which uses a backtracking algorithm to solve the puzzle.
- Result Display: The original and solved puzzles are displayed:
  - Web App: The result is shown directly in the browser.
  - Command Line: The solved puzzle is displayed in an OpenCV window. The result is visualized without saving it to a file, using `cv2.imshow()` to show the solved puzzle.
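To make the puzzle-solving step concrete, here is a minimal backtracking solver in plain Python. It is a generic sketch of the technique, not PySudoku's actual implementation, and the function names are hypothetical. Empty cells are represented by `0`, which is an assumption about how unrecognized or blank cells are encoded.

```python
from typing import List, Optional, Tuple

Board = List[List[int]]  # 9x9 grid, 0 marks an empty cell (assumption)


def find_empty(board: Board) -> Optional[Tuple[int, int]]:
    """Return the first empty cell as (row, col), or None if the board is full."""
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                return r, c
    return None


def is_valid(board: Board, row: int, col: int, digit: int) -> bool:
    """Check that digit does not clash with its row, column, or 3x3 box."""
    if digit in board[row]:
        return False
    if any(board[r][col] == digit for r in range(9)):
        return False
    top, left = 3 * (row // 3), 3 * (col // 3)
    return all(
        board[r][c] != digit
        for r in range(top, top + 3)
        for c in range(left, left + 3)
    )


def solve(board: Board) -> bool:
    """Fill the board in place; return False if no solution exists."""
    cell = find_empty(board)
    if cell is None:
        return True  # no empty cells left: solved
    row, col = cell
    for digit in range(1, 10):
        if is_valid(board, row, col, digit):
            board[row][col] = digit
            if solve(board):
                return True
            board[row][col] = 0  # undo the guess and try the next digit
    return False


puzzle: Board = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = solve(puzzle)
for row in puzzle:
    print(" ".join(str(d) for d in row))
```

A misread digit usually makes the grid unsolvable, in which case `solve` returns `False`; that is a useful signal to re-check the recognition output.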
To set up SudokuVision on your local machine, follow the instructions below:
git clone https://github.com/ArchitJ6/SudokuVision.git
cd SudokuVision
Create a virtual environment (recommended) and install the required packages:
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
pip install -r requirements.txt
Download the Chars74K and TMNIST datasets (MNIST is loaded directly from Keras). The datasets should be organized in the following directory structure:
/datasets
/Chars74K-Digital-English-Font
Extract the files and place the folders for digits 0 to 9 here, each containing images of the corresponding digit (labeled accordingly).
/tmnist
This dataset contains a `data.csv` file with the data for handwritten digits.
- MNIST: The MNIST dataset will be used directly from Keras.
- Chars74K-Digital-English-Font: Extract the files and organize them into folders for digits 0 to 9, with images of each digit placed inside the corresponding folder, labeled by the digit.
- TMNIST: This dataset includes a `data.csv` file that contains the data for handwritten digits.
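Before training, it can help to verify that the datasets are laid out as described. The check below is a hypothetical helper, not part of the repository. It assumes the digit folders are named `0` to `9` and that the TMNIST file sits at `tmnist/data.csv`; adjust the names if your copy differs.

```python
import tempfile
from pathlib import Path
from typing import List

CHARS74K_DIR = "Chars74K-Digital-English-Font"
TMNIST_DIR = "tmnist"


def check_dataset_layout(root: Path) -> List[str]:
    """Return a list of problems found under the datasets root (empty if OK)."""
    problems = []
    chars_dir = root / CHARS74K_DIR
    for digit in range(10):
        # Assumption: one folder per digit, named "0" .. "9".
        if not (chars_dir / str(digit)).is_dir():
            problems.append(f"missing folder: {chars_dir / str(digit)}")
    csv_path = root / TMNIST_DIR / "data.csv"
    if not csv_path.is_file():
        problems.append(f"missing file: {csv_path}")
    return problems


# Demonstration on a throwaway directory rather than real data.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "datasets"
    before = check_dataset_layout(root)  # nothing exists yet
    for digit in range(10):
        (root / CHARS74K_DIR / str(digit)).mkdir(parents=True)
    (root / TMNIST_DIR).mkdir()
    (root / TMNIST_DIR / "data.csv").write_text("placeholder\n")
    after = check_dataset_layout(root)

print(len(before), len(after))
```

On a real checkout you would call `check_dataset_layout(Path("datasets"))` and print any problems it returns.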
To run the web interface using Streamlit, follow these steps:
- Run the Streamlit app: `streamlit run app.py`
- Upload the Image:
  - After the app starts, open the URL provided by Streamlit.
  - Upload an image of the Sudoku puzzle (printed or handwritten).
  - Click "Solve Sudoku" to process the image and get the solution.
- Output: The original puzzle with solved values will be displayed directly on the web interface.
To use the command-line interface:
- Run the script: `python solve.py --image <path_to_image> --debug -1`
  - `--image`: Path to the Sudoku image.
  - `--debug`: Set to `1` for debug mode, which visualizes the grid and digit extraction process.
- Output: The solved Sudoku puzzle will be displayed in a new window using OpenCV. The window closes when any key is pressed.
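The two flags above could be parsed along the following lines. This is an illustrative sketch using the standard `argparse` module, not the actual contents of `solve.py`; the default of `-1` for `--debug` and the "greater than zero means enabled" rule are assumptions inferred from the example command.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two documented flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Solve a Sudoku puzzle from an image.")
    parser.add_argument("--image", required=True, help="path to the Sudoku image")
    parser.add_argument(
        "--debug",
        type=int,
        default=-1,  # assumption: debug mode is off unless a positive value is given
        help="set to 1 to visualize the grid and digit extraction steps",
    )
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--image", "puzzle.png", "--debug", "1"])
debug_enabled = args.debug > 0
print(args.image, debug_enabled)
```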
To train the model for digit recognition, use the following script:
python train_model.py
This will load the datasets, preprocess the data, train the model, and save the trained model for future use.
To get the most accurate results, keep the following tips in mind:
- Ensure good lighting when capturing handwritten Sudoku puzzles for optimal digit recognition.
- Use high-resolution images for better grid and digit extraction.
- For handwritten puzzles, keep the digits legible for improved accuracy.
If you run into issues, check these common solutions:
- Missing Dependencies: Make sure all packages are installed by running `pip install -r requirements.txt`.
- Image Processing Errors: Ensure that the uploaded image is clear and contains a proper Sudoku grid.
- Solver Not Working: Make sure the digits are clearly detected by checking the debug output with the `--debug` flag.
Your uploaded images are processed locally and are not stored long-term. We respect your privacy and ensure that no sensitive information is exposed during image upload and processing.
We welcome contributions! To contribute to SudokuVision, follow these steps:
- Fork the repository.
- Create a new branch (`git checkout -b feature-name`).
- Make your changes.
- Commit your changes (`git commit -m 'Add feature'`).
- Push to the branch (`git push origin feature-name`).
- Open a pull request with a description of your changes.
- PySudoku Library: For providing an efficient backtracking-based solver.
- MNIST: For the dataset used for training the digit recognition model.
- Chars74K: For the dataset of digital fonts, enriching the model's ability to recognize various types of digits.
- TMNIST: For further diversifying the training data and enhancing recognition accuracy.
This project is licensed under the MIT License.