ArchitJ6/SudokuVision

🧩 SudokuVision

SudokuVision is an OCR-powered Sudoku solver that uses machine learning 🤖 and image processing 📸 techniques to extract Sudoku grids from images and solve them. It handles both handwritten ✍️ and printed 📰 puzzles, and accuracy depends on how clearly the grid and digits appear in the image 🎯.


✨ Overview

SudokuVision integrates Optical Character Recognition (OCR) for grid extraction 🧠 and deep learning algorithms 🔍 to recognize digits within the grid. The application then uses the PySudoku library 🧩 to solve the puzzle, providing a seamless and fast solution ⏱️.

  • OCR-Based Sudoku Grid Detection 📸
  • Digit Recognition Using Deep Learning 🤖
  • PySudoku-Based Solver 🧩
  • Works with Printed and Handwritten Sudoku Puzzles ✍️
  • Supports PNG, JPG, JPEG Images 🖼️

✨ Features

  • Grid Extraction: Automatically extracts Sudoku grids from images using advanced image processing techniques 🎯.
  • Digit Recognition: Identifies digits within the grid using deep learning models 🔢.
  • Fast Solver: Solves puzzles using the PySudoku backtracking solver ⏱️.
  • Command Line and Web Interface: Provides both a command-line and a Streamlit-based web interface for ease of use 🖥️.
  • Multiple Image Formats: Works with PNG, JPG, and JPEG images 📸.
  • Real-time Visualization: Displays the solved Sudoku puzzle directly in the web app 🌐 or, when run from the command line, in an OpenCV window 💡.

📊 Datasets Used

The following datasets were used for training the deep learning model for digit recognition and grid extraction:

🧑‍🏫 1. MNIST Dataset

The MNIST dataset, containing 60,000 training images of handwritten digits 🖋️ (plus 10,000 test images), was used to train the deep learning model for digit recognition. The images are 28x28-pixel grayscale 🖤, which makes the dataset well suited to training models that recognize handwritten digits.

  • Preprocessing:
    • Grayscale normalization to a range of [0, 1] 🌑.
    • Reshaped to 28x28 pixels 🖼️.
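
The two preprocessing steps above can be sketched with NumPy as follows. This is only an illustration, not the project's training code: the function name `preprocess_digits` is hypothetical, the random batch stands in for real MNIST images, and the trailing channel axis is an assumption about what the model expects.

```python
import numpy as np

def preprocess_digits(images: np.ndarray) -> np.ndarray:
    """Scale uint8 pixels to [0, 1] and reshape to (N, 28, 28, 1)."""
    scaled = images.astype("float32") / 255.0   # 0..255 -> 0.0..1.0
    # The extra channel axis is an assumption (typical for CNN inputs).
    return scaled.reshape(-1, 28, 28, 1)

# Stand-in batch with the same shape and dtype as MNIST images.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

x = preprocess_digits(batch)
print(x.shape, x.dtype)
```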

🔠 2. Chars74K Dataset

The Chars74K dataset contains images of characters in various fonts 🔠, including digits, used to supplement the training process with diverse digital font variations 🅰️.

  • Preprocessing:
    • Resized to 28x28 pixels 🖼️.
    • Grayscale conversion and normalization to a range of [0, 1] 🌑.

This dataset enhances the model's ability to recognize digits in digital fonts, improving accuracy across various types of input 📝.

✍️ 3. TMNIST Dataset

The TMNIST (Typeface MNIST) dataset is used to further train and diversify the digit recognition capabilities 🤖. It contains digit images in the same 28x28 format as MNIST, rendered in a wide range of typefaces rather than handwritten, so it mainly adds variety for printed digits ✍️.

  • Preprocessing:
    • Data is scaled to a range of [0, 1] 🌑.
    • Labels are encoded using LabelEncoder 🔣 and converted to categorical values 📊.
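
To make the label step concrete, here is a dependency-free sketch of what label encoding followed by conversion to categorical (one-hot) values produces. It is a stand-in for illustration only: the project names LabelEncoder, and the helper names `encode_labels` and `one_hot` below are hypothetical.

```python
def encode_labels(labels):
    """Map arbitrary labels to integer ids using sorted class order."""
    classes = sorted(set(labels))
    index = {cls: i for i, cls in enumerate(classes)}
    return [index[label] for label in labels], classes

def one_hot(ids, num_classes):
    """Turn integer ids into one-hot rows (the 'categorical' form)."""
    return [[1.0 if j == i else 0.0 for j in range(num_classes)] for i in ids]

labels = ["3", "0", "7", "3"]
ids, classes = encode_labels(labels)
targets = one_hot(ids, len(classes))

print(classes)     # ['0', '3', '7']
print(ids)         # [1, 0, 2, 1]
print(targets[0])  # [0.0, 1.0, 0.0]
```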

🔍 How It Works

  1. Puzzle Extraction 🧩
    The uploaded image is processed using OpenCV 🖼️ for grid extraction. The grid's edges are detected, and the puzzle is segmented into individual cells 🏷️.

  2. Digit Recognition 🔢
    Each cell in the grid is processed by a deep learning model that recognizes the digits 🧠. The model is trained on the MNIST, Chars74K, and TMNIST datasets 📊.

  3. Puzzle Solving 🧩
    Once the digits are identified, they are passed to the PySudoku solver, which uses a backtracking algorithm 🔄 to solve the puzzle 🧩.

  4. Result Display 🎥
    The original and solved puzzles are displayed:

    • Web App 🌐: The result is shown directly in the browser 🌍.
    • Command Line 💻: The solved puzzle is shown in an OpenCV window 🖼️ opened with cv2.imshow(), so the result is visualized without being saved to a file 🧩.
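
For readers unfamiliar with backtracking (step 3), the idea can be sketched in plain Python. This is a generic textbook version for illustration, not PySudoku's implementation; the function names are hypothetical and empty cells are assumed to be stored as 0.

```python
def is_valid(board, r, c, d):
    """Return True if digit d can be placed at (r, c)."""
    if d in board[r]:
        return False
    if any(board[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(board[i][j] != d
               for i in range(br, br + 3)
               for j in range(bc, bc + 3))

def solve(board):
    """Fill empty cells (0) in place; return True if a solution exists."""
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                for d in range(1, 10):
                    if is_valid(board, r, c, d):
                        board[r][c] = d
                        if solve(board):
                            return True
                        board[r][c] = 0  # undo and try the next digit
                return False             # nothing fits here: backtrack
    return True                          # no empty cells left

puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = solve(puzzle)
print(solved, puzzle[0])
```

The solver tries digits in order and undoes a placement as soon as it leads to a dead end, which is why a single misrecognized digit can make the search fail outright.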

⚙️ Setup and Installation

To set up SudokuVision 🧩 on your local machine 💻, follow the instructions below:

📁 1. Clone the repository

git clone https://github.com/ArchitJ6/SudokuVision.git  
cd SudokuVision  

📦 2. Install Dependencies

Create a virtual environment (recommended) 🌱 and install required packages:

python -m venv venv  
source venv/bin/activate  # On Windows use `venv\Scripts\activate`  
pip install -r requirements.txt  

📥 3. Download the Datasets

Make sure to download the Chars74K, MNIST, and TMNIST datasets. The datasets should be organized in the following directory structure:

/datasets
    /Chars74K-Digital-English-Font
        (folders for digits 0 to 9, each containing images of that digit)
    /tmnist
        data.csv
  • MNIST: The MNIST dataset is loaded directly from Keras.
  • Chars74K-Digital-English-Font: Extract the files and organize them into folders for digits 0 to 9, with each folder holding the images of the corresponding digit.
  • TMNIST: This dataset includes a data.csv file that contains the digit data.
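
As a quick way to check that the Chars74K folders are laid out as described, a small script can walk the digit folders and pair each image with its label. This is a sketch under assumptions: the function name `list_digit_images` is hypothetical, the folders are assumed to be named literally 0 to 9, and the accepted file extensions are a guess. The demo builds a throwaway directory instead of touching real data.

```python
from pathlib import Path
import tempfile

IMAGE_EXTS = {".png", ".jpg", ".jpeg"}  # assumed extensions

def list_digit_images(root):
    """Collect (path, label) pairs from sub-folders named 0..9 under root."""
    samples = []
    for digit in range(10):
        folder = Path(root) / str(digit)
        if not folder.is_dir():
            continue  # tolerate missing digit folders
        for path in sorted(folder.iterdir()):
            if path.suffix.lower() in IMAGE_EXTS:
                samples.append((path, digit))
    return samples

# Demo on a temporary directory that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "datasets" / "Chars74K-Digital-English-Font"
    for digit in (0, 7):
        (root / str(digit)).mkdir(parents=True)
        (root / str(digit) / f"sample_{digit}.png").touch()
    (root / "7" / "notes.txt").touch()  # non-image files are skipped
    found = [(p.name, label) for p, label in list_digit_images(root)]

print(found)  # [('sample_0.png', 0), ('sample_7.png', 7)]
```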

🧑‍💻 Usage

🌐 1. Streamlit Interface

To run the web interface using Streamlit 🖥️, follow these steps:

  1. Run the Streamlit app:
streamlit run app.py
  2. Upload the Image 🖼️:

    • After the app starts, open the URL provided by Streamlit 🌐.
    • Upload an image of the Sudoku puzzle (printed or handwritten) 🧩.
    • Click "Solve Sudoku" 🧠 to process and get the solution 🧩.
  3. Output 🎥:
    The original puzzle with solved values will be displayed directly on the web interface 🌐.

💻 2. Command-Line Interface (CLI)

To use the command-line interface 💻:

  1. Run the script:
python solve.py --image <path_to_image> --debug -1
  • --image: Path to the Sudoku image 🖼️.
  • --debug: Set to 1 for debug mode 🛠️, which visualizes the grid and digit extraction process 🔍.
  2. Output 🎥:
    The solved Sudoku puzzle 🧩 will be displayed in a new OpenCV window 🖼️. The window closes when any key is pressed ⏳.
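
For orientation, the two flags could be parsed with argparse roughly as below. This is not the actual contents of solve.py: the default of -1 and the rule that only positive values enable debug mode are assumptions made for the sketch.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Solve a Sudoku puzzle from an image.")
    parser.add_argument("--image", required=True, help="path to the Sudoku image")
    # Assumption: -1 is the default and means debug mode is off.
    parser.add_argument("--debug", type=int, default=-1,
                        help="set to 1 to visualize intermediate steps")
    return parser

# Parse the same arguments as the command shown above.
args = build_parser().parse_args(["--image", "puzzle.jpg", "--debug", "-1"])
debug_enabled = args.debug > 0  # assumed rule: positive values enable debug

print(args.image, args.debug, debug_enabled)
```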

📚 3. Model Training

To train the model for digit recognition 🧠, use the following script:

python train_model.py

This will load the datasets 📊, preprocess the data 🔄, train the model 🤖, and save the trained model for future use 💾.


💡 Best Practices for Usage

To get the most accurate results, keep the following tips in mind:

  • Ensure good lighting when capturing handwritten Sudoku puzzles ✍️ for optimal digit recognition.
  • Use high-resolution images 🖼️ for better grid and digit extraction.
  • For handwritten puzzles, maintain legibility of digits for improved accuracy ✍️.

🔧 Troubleshooting

If you run into issues, check these common solutions:

  • Missing Dependencies: Make sure all packages are installed by running pip install -r requirements.txt 📦.
  • Image Processing Errors: Ensure that the uploaded image is clear and contains a proper Sudoku grid 📸.
  • Solver Not Working: Make sure the digits are clearly detected by checking the debug output with the --debug flag 🛠️.
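
When the solver fails, a common cause is a misread digit that puts the same number twice in a row, column, or 3x3 box. One possible way to spot this before solving is sketched below; the function `find_conflicts` is hypothetical and not part of SudokuVision, and empty cells are assumed to be 0.

```python
def find_conflicts(board):
    """Return sorted (unit, index, digit) entries for repeated digits."""
    conflicts = set()

    def check(unit, idx, cells):
        seen = set()
        for d in cells:
            if d == 0:
                continue  # empty cell
            if d in seen:
                conflicts.add((unit, idx, d))
            seen.add(d)

    for i in range(9):
        check("row", i, board[i])
        check("col", i, [board[r][i] for r in range(9)])
        br, bc = 3 * (i // 3), 3 * (i % 3)
        check("box", i, [board[r][c]
                         for r in range(br, br + 3)
                         for c in range(bc, bc + 3)])
    return sorted(conflicts)

board = [[0] * 9 for _ in range(9)]
board[0][0] = 5
board[0][8] = 5  # same row as (0, 0)
board[1][1] = 5  # same 3x3 box as (0, 0)

print(find_conflicts(board))  # [('box', 0, 5), ('row', 0, 5)]
```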

🔒 Security and Privacy

Your uploaded images are processed locally and are not stored long-term. We respect your privacy and ensure that no sensitive information is exposed during image upload and processing 🔐.


🤝 How to Contribute

We welcome contributions! 🎉 To contribute to SudokuVision 🧩, follow these steps:

  1. Fork the repository 🍴.
  2. Create a new branch (git checkout -b feature-name) 🌱.
  3. Make your changes ✍️.
  4. Commit your changes (git commit -m 'Add feature') 💬.
  5. Push to the branch (git push origin feature-name) 🚀.
  6. Open a pull request with a description of your changes 📄.

🙏 Acknowledgments

  • PySudoku Library 🧩: For providing an efficient backtracking-based solver.
  • MNIST 📚: For the dataset used for training the digit recognition model.
  • Chars74K 🔠: For the dataset of digital fonts, enriching the model's ability to recognize various types of digits.
  • TMNIST ✍️: For further diversifying the training data and enhancing recognition accuracy.

📜 License

This project is licensed under the MIT License ⚖️.
