# math-reasoning

Here are 32 public repositories matching this topic...

rasbt / reasoning-from-scratch

Implement a reasoning LLM in PyTorch from scratch, step by step

python machine-learning reinforcement-learning ai deep-learning pytorch artificial-intelligence reasoning distillation large-language-models llm llms chain-of-thought rlhf test-time-compute reasoning-models grpo math-reasoning inference-time-scaling

Updated Jun 12, 2026
Jupyter Notebook

YutingLi0606 / Vision-Matters

(ArXiv25) Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning

mllm mllm-reasoning math-reasoning

Updated Sep 30, 2025
Python

lupantech / ineqmath

Solving Inequality Proofs with Large Language Models.

theorem-proving inequality olympiad llms llm-as-a-judge math-reasoning

Updated Dec 15, 2025
Python

InternLM / Spark

An official implementation of "SPARK: Synergistic Policy And Reward Co-Evolving Framework"

self-improvement multi-modal large-language-models vision-language-model reward-model large-vision-language-models self-rewarding math-reasoning

Updated Oct 23, 2025
Python

he-yufeng / gsm8k-prompt-engineering

Prompt-engineering study on LLM math reasoning (GSM8K) and code generation (HumanEval): zero/few-shot, self-consistency, self-verification, and experiments on prompt quality, complexity, demonstrations, and diversity.

nlp code-generation self-consistency few-shot-learning humaneval llm prompt-engineering chain-of-thought gsm8k math-reasoning

Updated Jun 27, 2026
Python

Arash-ra03 / ArmLLM

ArmLLM 2025 solutions covering ViT from scratch, SigLIP–Qwen LaTeX OCR, GRPO reasoning post-training, inference-time reasoning strategies, and adversarial vision attacks.

reinforcement-learning computer-vision transformers pytorch multimodal-learning post-training adversarial-attacks adversarial-robustness vision-transformer latex-ocr vision-language-model qwen siglip inference-time-compute grpo math-reasoning armllm llm-summer-school

Updated Nov 26, 2025
Jupyter Notebook

wd041216-bit / ai-benchmark-kb

AI Benchmark knowledge base: a comprehensive collection of the benchmark question sets that major AI companies use to test model performance

benchmark knowledge-base model-evaluation reasoning multimodal ai-benchmarks instruction-following llm long-context safety-evaluation ai-performance math-reasoning coding-benchmark benchmark-collection eval-frameworks

Updated Apr 16, 2026

goblinasaddy / nanoJEPA

A minimal JEPA-based language model demonstrating latent-space reasoning on GSM8K using a single decoder-only Transformer.

deep-learning pytorch transformer research-project representation-learning language-model latent-space gsm8k jepa math-reasoning

Updated Feb 28, 2026
Python

Seanaaa0 / QT-R1

STaR × S1 math pipeline on Qwen2.5-1.5B: LoRA fine-tuning, a strict "Final:" answer format, and roughly 20–30% accuracy (OpenR1-Math split).

transformers star dataset-pipeline qlora peft-fine-tuning-llm qwen2-5 math-reasoning openr1-math

Updated Sep 6, 2025
Python

mianhua157 / math-data-cleaning-qwen

Data cleaning and structuring pipeline for math reasoning tasks using Qwen3-0.6B for LLM post-training.

nlp machine-learning transformers pytorch data-processing data-cleaning post-training llm qwen math-reasoning

Updated Apr 13, 2026
Python

wsdjzlh / math-process-supervision-qwen

A controlled LoRA finetuning study on process supervision for mathematical reasoning with Qwen2.5-Math-7B-Instruct.

lora process-supervision llm gsm8k qwen math-reasoning

Updated Apr 23, 2026
Python

rishabhsai / math-self-doubt

Paired-framing pilot on observable self-doubt in math reasoning models

ai-safety mechanistic-interpretability llm-evals math-reasoning model-evals

Updated Jun 11, 2026
Python

vudang4494 / xling-grpo-sub3b

Beyond English-Only GRPO: a multi-seed controlled study of training-language and auxiliary-reward effects in sub-3B math reasoning (GRPO + LoRA, single GPU).

multilingual reinforcement-learning lora reproducibility cross-lingual llm qwen mgsm grpo math-reasoning

Updated Jun 30, 2026
Python

KaiP-598 / grpo-from-scratch

GRPO (Group Relative Policy Optimization) implemented from scratch in PyTorch. 10 ablation experiments.

training reinforcement-learning pytorch from-scratch llm rlhf vllm deepseek-r1 grpo math-reasoning

Updated Apr 26, 2026
Python

ashioyajotham / spot-the-flaw

Verifier-backed math reasoning lab for proof flaw localization, minimal repair, and RL-style evaluation.

reinforcement-learning gemini process-supervision proof-checking verifier reinforcement-learning-environments reinfrocement-learning proof-repair llm-evaluation proof-verification math-reasoning

Updated Jun 11, 2026
Python

handsomeZR-netizen / nnu-nlp-gsm8k-coursework-2026

NLP course final project (2026), Nanjing Normal University, supervised by 孔力: GSM8K math QA with Seq2Seq, Transformer and LLMs.

nlp transformer seq2seq coursework llm gsm8k nnu math-reasoning

Updated Jun 24, 2026
Python

EladMoshe98 / My_MSc_Thesis_on_LLM_interpretability

MSc capstone: mechanistic interpretability of Chain-of-Thought reasoning in LLMs via SAE features and logprob signals

nlp research transformers jupyter-notebooks gemma interpretability sparse-autoencoder feature-analysis large-language-models chain-of-thought mechanistic-interpretability causal-mediation math-reasoning logprobs

Updated Jun 29, 2026
Jupyter Notebook

hoadm-net / MathCoRL

Comprehensive framework for mathematical reasoning research with dual research capabilities

nlp prompt-engineering math-reasoning

Updated Mar 18, 2026
Python

huyxdang / RLVR-Decomposed-Implementation

Small-scale Implementation and Extension of “The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning” (NeurIPS '25)

reinforcement-learning llm math-reasoning

Updated Oct 23, 2025
Python

StaryMoon / LIMO-Unofficial

Unofficial PyTorch reproduction for LIMO: Less is More for Reasoning.

pytorch reproduction reasoning sft data-efficient-learning math-reasoning unofficial-implementation

Updated Jun 10, 2026
Python
