Kimari Local AI

Run useful local AI on older NVIDIA GPUs.
Local-first · Open-source · GGUF runtime · Gateway dashboard · Agent-ready

What is Kimari?

Kimari Local AI is an open-source framework for running local language models on consumer NVIDIA GPUs, especially older cards such as the GTX 1060 6GB and GTX 1080 8GB.

Kimari provides:

a CLI-first local AI workflow;
GPU-aware profiles;
llama.cpp / GGUF runtime support;
an OpenAI-compatible local endpoint;
a Gateway Dashboard;
local integration helpers for Open WebUI, Continue.dev, OpenClaw and Hermes;
private training/evaluation infrastructure for future Kimari models.

Status: alpha software. Useful today, but not production-ready.

Current Truth

Kimari today is:

A local AI framework and CLI.
A GGUF/llama.cpp workflow for old NVIDIA GPUs.
A local OpenAI-compatible endpoint helper.
A Gateway Dashboard preview.

Kimari today is not:

A new inference engine replacing llama.cpp.
A public Kimari-4B model.
A production server.
A benchmark leaderboard.

See docs/PROJECT_TRUTH.md for the full honesty document.

Kimari-4B Training Status (May 2026)

We are iterating on a custom model to power Kimari. The focus is benchmark honesty: the model must not fabricate scores or metrics.

Phase	Status	Result
V2 SFT (SmolLM3-3B)	✅	1 safety + 2 factual regressions
V3 SFT (+12 corrective)	✅	es-tech-003 fixed, refuse-010 persisted
V4 SFT (+6 aggressive)	✅	Aggregate improved, benchmark honesty still failed
A/B: Qwen3 vs SmolLM3	✅	Qwen3 wins (better honesty + tech quality)
DPO Pilot 15% (Qwen3)	✅	3G/0R/7Y — 0 benchmark fabrications
DPO Full 420 (Qwen3)	❌	Regressed (1 fabrication)
Expanded eval (180 prompts)	⏳	Candidate V5 pending

Current best: Pilot15 adapter (Smouj013/kimari-v5-qwen3-dpo-pilot15-adapter, private). Gate: BLOCKED — must reach 0 fabrications in expanded evaluation.

No public weights, adapters, GGUF files or benchmark claims at this stage.

Why Kimari?

Older GPU support — Designed specifically for GTX 1060 and GTX 1080, not just the latest cards.
Zero cloud dependency — Everything runs locally. No subscriptions, no API keys, no telemetry.
OpenAI-compatible — Drop-in endpoint for existing tools and integrations.
CLI-first — One command to install, one command to start, one command to diagnose.
Honest status — No inflated benchmarks, no "coming soon" claims. Alpha means alpha.

Current Status

Area	Status
Framework / CLI	✅ Usable alpha
Local GGUF runtime	✅ Working
OpenAI-compatible endpoint	✅ Working
GTX 1060 validation	✅ Validated with TinyLlama test model
Gateway Dashboard	✅ Local preview
Open WebUI / Continue / OpenClaw configs	✅ Documented
Kimari SFT/private adapter work	🔒 Private only
Public Kimari-4B weights	❌ Not released
Public GGUF Kimari model	❌ Not released
Release gate	🔒 BLOCKED

Kimari is the framework. Kimari-4B is not released yet.

No public Kimari weights, adapters or GGUF files are available at this stage.

Quick Start

One-command install

curl -fsSL https://raw.githubusercontent.com/smouj/kimari-local-ai/main/install.sh | bash

Or the secure alternative (recommended for review):

curl -fsSLO https://raw.githubusercontent.com/smouj/kimari-local-ai/main/install.sh
less install.sh
bash install.sh --dry-run
bash install.sh --with-test-model --yes

Then open the guided console:

kimari console

Manual install

git clone https://github.com/smouj/kimari-local-ai.git
cd kimari-local-ai
pip install -e .
kimari doctor --deep

Download the test model

kimari pull test

Start the local API

kimari start

Local endpoint: http://127.0.0.1:11435/v1

Open the Gateway Dashboard

kimari gateway setup
kimari gateway start --open

The dashboard is controlled through the Kimari CLI. Users should not need to run npm manually.

Screenshots

Dashboard Overview (Light)	Dashboard Overview (Dark)

Gateway Dashboard at http://127.0.0.1:3105 — monochrome design, dark/light mode. Start with kimari gateway start --open.

Gateway Dashboard

Kimari includes a local dashboard for monitoring and managing your local AI environment.

kimari gateway start --open

The dashboard shows:

local runtime status;
GPU and VRAM information;
model status;
profiles;
integrations;
logs;
experimental chat playground;
release gate status.

Security defaults:

Setting	Default
Host	`127.0.0.1`
Public bind	Disabled
Mode	Local preview
Gate	`BLOCKED`
Tokens in UI	No
Public model upload	No

See docs/GATEWAY_DASHBOARD.md for details.

Local Runtime Validation

Kimari has been tested on a real NVIDIA GTX 1060 6GB under WSL2 with llama-server CUDA.

Metric	CUDA GTX 1060	CPU-only
Prompt processing	228 tok/s	77 tok/s
Token generation	73 tok/s	33 tok/s
Model VRAM	1221 MiB	—

Detail	Value
GPU	NVIDIA GeForce GTX 1060 6GB
OS	WSL2 Ubuntu 24.04
Backend	llama-server CUDA
Test model	TinyLlama 1.1B Q4_K_M
Kimari-4B	Not released

This validation uses TinyLlama as a test model. It is not a Kimari-4B benchmark.

See docs/GTX1060_SHOWCASE.md and docs/GTX1060_LOCAL_RUNTIME_RESULT.md.

What Works Today

Feature	Command / Docs
Diagnostics	`kimari doctor --deep`
Guided setup	`kimari setup --write --yes`
Test model download	`kimari pull test`
Local API server	`kimari start`
Chat via CLI	`kimari chat "hello"`
Gateway dashboard	`kimari gateway start --open`
Open WebUI config	`kimari integrations generate --target openwebui`
Continue.dev config	`kimari integrations generate --target continue`
OpenClaw config	`kimari integrations generate --target openclaw`
Model verification	`kimari models hash` / `kimari models verify`
Update check	`kimari update check --online`
Performance tuning	`kimari optimize` / `kimari perf`
Benchmark	`kimari benchmark --dry-run`

Full CLI reference: docs/CLI_REFERENCE.md

Model Status

Kimari currently runs compatible local GGUF models. Official Kimari models are still private or planned.

Model line	Status
TinyLlama test profile	✅ Available for validation
Kimari Runtime 1.5B	🔒 Private experiments
Kimari Core 3B	🔒 Private experiments
Kimari-4B	❌ Not released
Official Kimari GGUF	❌ Not released

Kimari follows an open-license model policy. Official public releases must use permissive-compatible base models and pass private evaluation, manual review and local GGUF validation.

See docs/KIMARI4B_RELEASE_GATE.md, docs/KIMARI_OPEN_LICENSE_POLICY.md, docs/KIMARI_BASE_MODEL_LICENSE_MATRIX.md.

Training and Evaluation Status

Kimari-4B model training is active. The focus is benchmark honesty: 0 fabricated benchmark claims.

Phase	Date	Result
V2 SFT (SmolLM3-3B)	May 2026	1 safety + 2 factual regressions
V3 SFT (+12 corrective)	May 2026	Partial improvement
V4 SFT (+6 aggressive)	May 2026	es-tech-003 fixed, honesty persisted
A/B: Qwen3 vs SmolLM3	May 2026	Qwen3 selected
DPO Pilot 15% (Qwen3)	May 2026	3G/0R/7Y — 0 fabrications
DPO Full 420 (Qwen3)	May 2026	Regressed (1 fabrication)
Expanded eval (180 prompts)	May 2026	⏳ Candidate V5 pending

Best checkpoint: Pilot15 (Smouj013/kimari-v5-qwen3-dpo-pilot15-adapter, private). Gate: BLOCKED

See:

Run agents today with public GGUF models

Kimari-4B is not public yet. For real local agent workflows today, use:

agent-qwen1060 — GTX 1060 6GB, Qwen3-4B Q4_K_M, 4K context
agent-qwen1080 — GTX 1080 8GB, Qwen3-4B Q4_K_M, 8K context
agent-smollm1060 — safer 3B fallback for GTX 1060

kimari pull recommended
kimari start --profile agent-qwen1060

See docs/RUN_AGENTS_NOW.md.

GPU Profiles

Profile	GPU	VRAM	Quantization	Context	Status
`test`	Any 6 GB+	6 GB	Q4_K_M	4,096	Default (alpha)
`agent-qwen1060`	GTX 1060	6 GB	Q4_K_M	4,096	✅ Public model
`agent-qwen1080`	GTX 1080	8 GB	Q4_K_M	8,192	✅ Public model
`agent-smollm1060`	GTX 1060	6 GB	Q4_K_M	4,096	✅ Public model (low VRAM)
`gtx1060`	GTX 1060	6 GB	Q4_K_M	8,192	Requires Kimari-4B
`gtx1080`	GTX 1080	8 GB	Q5_K_M	16,384	Requires Kimari-4B
`turbo`	6 GB+	6 GB	IQ4_XS	8,192	Requires Kimari-4B

The test profile is the default during alpha, using TinyLlama 1.1B. For real agent work, use agent-qwen1060 or agent-smollm1060 with public GGUF models. When Kimari-4B is published, gtx1060 will become the new default.

Documentation

Getting started

Topic	Link
Install guide	docs/INSTALL_ONE_COMMAND.md
Console guide	docs/KIMARI_CONSOLE.md
Gateway dashboard	docs/GATEWAY_DASHBOARD.md
CLI reference	docs/CLI_REFERENCE.md
Local endpoint test	docs/LOCAL_OPENAI_ENDPOINT_TEST.md
Run agents now	docs/RUN_AGENTS_NOW.md

Integrations

Tool	Link
Open WebUI	docs/OPENWEBUI_LOCAL_SETUP.md
OpenClaw	docs/OPENCLAW_LOCAL_SETUP.md
Continue.dev	docs/CONTINUE_LOCAL_SETUP.md
Hermes	docs/OPENWEBUI_OPENCLAW_QUICK_CONFIG.md

Model and training policy

Topic	Link
Release gate	docs/KIMARI4B_RELEASE_GATE.md
Open-license policy	docs/KIMARI_OPEN_LICENSE_POLICY.md
Dataset policy	docs/KIMARI_SFT_V1_DATASET.md
Eval plan	docs/KIMARI_EVAL_PRIVATE_V1.md
Training history	docs/KIMARI4B_RUN_HISTORY.md
Training plan	docs/MODEL_TRAINING_PLAN.md

Advanced

Topic	Link
Performance tuning	docs/PERFORMANCE_TUNING_PLAN.md
Model hashing	docs/MODEL_HASHING.md
Benchmarks	docs/MEASURED_BENCHMARKS.md
Experimental API	docs/API_EXPERIMENTAL.md
PyPI release gate	docs/PYPI_RELEASE_GATE.md
Architecture	docs/00-03_architecture.md
Security	SECURITY.md
Privacy	PRIVACY.md

Hugging Face

Resource	Link
Kimari organization	https://huggingface.co/kimari-ai
Kimari Fit Lab	https://huggingface.co/spaces/kimari-ai/kimari-fit-lab
Reference GGUF collection	https://huggingface.co/collections/Smouj013/kimari-compatible-gguf-models-6a0352c75d2bfeff34d51e66

The Hugging Face Space is a compatibility/demo tool. It does not run Kimari-4B. The collection contains reference/community GGUF models, not official Kimari models.

Roadmap

Stage	Goal
✅ Current	Local runtime + Gateway Dashboard + GitHub Pages landing page + Console + Installer
Next	Private adapter runtime preview
Next	Agent Gateway tools and web-search dry-run
Next	Manual review of private outputs
Later	Private GGUF export
Later	Public preview decision

See ROADMAP.md for the full roadmap.

Safety

Kimari is local-first and conservative by default:

localhost-only defaults;
no public bind unless explicitly requested;
no token storage in dashboard;
no public model upload;
no public GGUF;
no benchmark claims without reproducible validation;
no automatic gate transitions.

Community

License

Kimari Local AI is released under the MIT License. See LICENSE.

Model weights are not included in this repository. See MODEL_LICENSES.md for model licensing information.

Made by Smouj · GitHub · Website

Name		Name	Last commit message	Last commit date
Latest commit History 249 Commits
.github		.github
agent-ctx		agent-ctx
apps/gateway-dashboard		apps/gateway-dashboard
benchmarks		benchmarks
cli		cli
config		config
data/dpo		data/dpo
dataset		dataset
docker		docker
docs		docs
eval		eval
huggingface		huggingface
ide/continue		ide/continue
kimari		kimari
knowledge/2026		knowledge/2026
models		models
packages/kimari-cli		packages/kimari-cli
public		public
releases		releases
reports		reports
scripts		scripts
server		server
skills		skills
src		src
systemd		systemd
tests		tests
tools		tools
training		training
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
GETTING_STARTED.md		GETTING_STARTED.md
GOVERNANCE.md		GOVERNANCE.md
LICENSE		LICENSE
MAINTAINERS.md		MAINTAINERS.md
MANIFEST.in		MANIFEST.in
MODEL_CARD.md		MODEL_CARD.md
MODEL_LICENSES.md		MODEL_LICENSES.md
Makefile		Makefile
PRIVACY.md		PRIVACY.md
README.md		README.md
RELEASE_CHECKLIST.md		RELEASE_CHECKLIST.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
SUPPORT.md		SUPPORT.md
components.json		components.json
eslint.config.mjs		eslint.config.mjs
install.ps1		install.ps1
install.sh		install.sh
next.config.ts		next.config.ts
postcss.config.mjs		postcss.config.mjs
pyproject.toml		pyproject.toml
requirements-dev.txt		requirements-dev.txt
tailwind.config.ts		tailwind.config.ts
tsconfig.json		tsconfig.json
worklog.md		worklog.md

Folders and files

Latest commit

History

Repository files navigation

Kimari Local AI

What is Kimari?

Current Truth

Kimari-4B Training Status (May 2026)

Why Kimari?

Current Status

Quick Start

One-command install

Manual install

Download the test model

Start the local API

Open the Gateway Dashboard

Screenshots

Gateway Dashboard

Local Runtime Validation

What Works Today

Model Status

Training and Evaluation Status

Run agents today with public GGUF models

GPU Profiles

Documentation

Getting started

Integrations

Model and training policy

Advanced

Hugging Face

Roadmap

Safety

Community

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages