Home

EMBODIOS Wiki

Bare-Metal AI Operating System

Current Status: 95% complete (AI Runtime: 100%)
Blocking: Native hardware testing only
Last Updated: 29 January 2026

Welcome

EMBODIOS is a bare-metal AI operating system, billed by the project as the world's first: the AI model runs directly on hardware without a traditional OS layer. No userspace. No OS overhead. Just transformers and hardware.

What started as a Friday night experiment has evolved into a production roadmap. The aim is to show that kernel-space AI inference is not only possible but also offers lower latency jitter and a smaller memory footprint for real-time and edge use cases.

Project Status

Component	Scope	Completion
Kernel Foundation	Memory, boot, interrupts, DMA, scheduler	95% ✅
AI Runtime	GGUF, BPE, streaming inference, quantization	100% ✅
Drivers	NVMe, VirtIO, e1000e, PCI, TCP/IP, Industrial	85% ✅
Performance	SIMD, parallel inference, benchmarks	90% ✅
Documentation	Wiki, README, Contributing guide	100% ✅
Overall	Ready for v1.0 pending native hardware testing	95%

Recent Achievements (January 2026)

✅ Interactive Chat Mode: talk command for dedicated conversation sessions
✅ Performance Stats: Separate perf command for timing metrics
✅ Console UX: Polished help system, status display, typo suggestions
✅ Production ISO Builder: scripts/create_iso.sh with GRUB boot menu
✅ Streaming Inference: Memory-efficient with parallel workers
✅ Q4_K NEON SIMD: Fused matmul for ARM64
✅ Stability Tests: 1h-72h automated long-running tests
✅ GGUF Parser: Full support for TinyLlama, Phi-2, SmolLM, Mistral-7B
✅ BPE Tokenizer: Proper tokenization from GGUF vocabulary
✅ All Quantization Types: Q4_K, Q5_K, Q6_K, Q8_0
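As an illustration of what one of these quantization types involves, here is a minimal host-side sketch of Q8_0 dequantization. It assumes the block layout used by ggml (a little-endian fp16 scale followed by 32 signed 8-bit values, 34 bytes per block). The function name is hypothetical, and this is not the EMBODIOS kernel code, which is written for bare metal.

```python
import struct

QK8_0 = 32                  # values per Q8_0 block (ggml layout)
BLOCK_BYTES = 2 + QK8_0     # fp16 scale + 32 int8 quants = 34 bytes


def dequantize_q8_0(raw):
    """Expand a run of Q8_0 blocks into floats: x[i] = d * q[i]."""
    if len(raw) % BLOCK_BYTES != 0:
        raise ValueError("buffer is not a whole number of Q8_0 blocks")
    out = []
    for offset in range(0, len(raw), BLOCK_BYTES):
        # "<e" reads a little-endian IEEE half float, "32b" reads 32 signed bytes
        d, *quants = struct.unpack_from("<e32b", raw, offset)
        out.extend(d * q for q in quants)
    return out
```

The K-quant types (Q4_K, Q5_K, Q6_K) use larger super-blocks with nested scales and are not covered by this sketch.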

Wiki Navigation

Overview & Planning:

Executive-Summary - Vision, targets, and roadmap overview
Current-State-Analysis - Detailed status, what works, what's missing
Development-Roadmap - Three-phase plan to v1.0

Architecture & Strategy:

Three-Strategic-Pillars - Core development tracks
Pillar-1:-Ollama-GGUF-Integration - Industry-standard model support (85% COMPLETE)
Pillar-2:-Linux-Driver-Compatibility - Reuse existing drivers
Pillar-3:-Performance-Optimization - 85+ tokens/sec target

Integration Guides:

llama.cpp-Integration-Roadmap - Core transformer implementation
EMBODIOS---exo-Integration-Architecture - Distributed inference (post-v1.0)

v1.0 Goals

Functionality:

✅ Load TinyLlama-1.1B Q4_K_M using GGUF format
✅ Generate coherent text
✅ Switch between models dynamically
✅ Interactive chat mode (talk command)
⏳ Boot on real hardware (Intel NUC) - blocking item

Performance:

⏳ 85+ tokens/sec inference - needs native hardware
✅ <20ms first token latency
✅ ±0.5ms latency jitter (10x better than userspace)
✅ <1 second boot time
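The jitter figure can be checked from per-token timestamps. The sketch below assumes jitter means the largest deviation of any inter-token interval from the mean interval. That definition and the function name are assumptions, since the page does not say how the ±0.5ms figure is measured.

```python
from statistics import mean


def latency_jitter_ms(timestamps_ms):
    """Return (mean inter-token interval, max deviation from that mean), in ms."""
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if not intervals:
        raise ValueError("need at least two timestamps")
    avg = mean(intervals)
    jitter = max(abs(x - avg) for x in intervals)
    return avg, jitter
```

Dividing 1000 by the mean interval gives tokens per second, so one timestamp log can serve both the throughput and the jitter goals.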

Deliverables:

✅ Production ISO with manifest system (scripts/create_iso.sh)
✅ Complete documentation in GitHub Wiki
✅ Contributing guide with code style
✅ Console Commands reference
✅ Example models (SmolLM, TinyLlama, Phi-2, Mistral-7B)
✅ Works in QEMU

Performance Targets

Metric	llama.cpp	EMBODIOS v1.0	Status
Throughput	83-86 tok/s	85-95 tok/s	⚠️ Needs benchmarking on native hardware
Memory	160 MB	120 MB	✅ 25% less
Latency jitter	±5-10 ms	±0.5 ms	✅ At least 10x lower
Boot time	N/A	<1 s	✅ Under 1 second
First-token latency	~50 ms	<20 ms	✅ 2.5x faster
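The ratios in the Status column follow directly from the two numeric columns. The small sketch below reproduces that arithmetic and converts a throughput target into a per-token time budget. The helper names are illustrative only.

```python
def percent_reduction(baseline, value):
    """How much smaller `value` is than `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


def speedup(baseline, value):
    """How many times smaller `value` is than `baseline`."""
    return baseline / value


def per_token_ms(tokens_per_sec):
    """Time budget per generated token at a given throughput."""
    return 1000.0 / tokens_per_sec
```

At 85 tokens/sec the budget is a little under 12 ms per token, so jitter of ±5-10 ms is comparable in size to the whole per-token budget, while ±0.5 ms is a small fraction of it.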

Technology Stack

Core Technologies:

Kernel: x86_64 and ARM64 multiboot2
AI Format: GGUF (Ollama-compatible)
Reference: llama.cpp (transformer implementation)
Drivers: Linux compatibility layer
Optimization: SIMD (SSE2 and AVX2 on x86_64, NEON on ARM64), integer-only math
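To make "integer-only math" concrete, here is a minimal sketch of a quantized dot product that never touches floating point on the hot path. It assumes int8 quantized values and per-block scales stored as Q16.16 fixed point. The fixed-point format and all names are assumptions for illustration, not the scheme EMBODIOS actually uses.

```python
FRAC_BITS = 16  # assumed Q16.16 fixed-point format


def to_fixed(x):
    """Convert a float scale to Q16.16 (done once, offline)."""
    return int(round(x * (1 << FRAC_BITS)))


def from_fixed(x_fx):
    """Convert a Q16.16 value back to float (for inspection only)."""
    return x_fx / (1 << FRAC_BITS)


def int_dot_fx(qa, qb, scale_a_fx, scale_b_fx):
    """Dot product of two quantized vectors, returned as Q16.16."""
    acc = 0
    for a, b in zip(qa, qb):
        acc += a * b  # int8 * int8 products summed in a wider integer
    # fixed * fixed yields 2*FRAC_BITS fractional bits; shift back to Q16.16
    scale_fx = (scale_a_fx * scale_b_fx) >> FRAC_BITS
    # plain integer * Q16.16 stays in Q16.16
    return acc * scale_fx
```

Python integers never overflow, so a kernel version in C would need to choose accumulator widths (for example 32-bit for the sum and 64-bit for the scaled result) explicitly.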

Development Tools:

GRUB 2.x (bootloader)
QEMU (testing)
GCC/Clang (compilation)
Docker (reproducible builds)

Key Design Principles

No Userspace: AI model runs in kernel mode with direct hardware access
Zero-Copy: Identity-mapped memory (virtual address equals physical address), so DMA buffers need no address translation or intermediate copies
Reuse, Don't Rewrite: Linux driver compatibility layer
Industry Standard: GGUF format for Ollama ecosystem compatibility
Performance First: SIMD throughout, cache-optimized data structures

Development Phases

Phase 1: Foundation ✅ COMPLETE

GGUF parser with metadata extraction
Core Linux compatibility shim
Quick performance wins (KV cache, pre-computed embeddings)
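For readers new to GGUF, the first step of metadata extraction is reading the fixed-size file header. The sketch below follows the public GGUF layout for version 2 and later: a 4-byte magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count, all little-endian. The function and tuple names are hypothetical, and this is a host-side illustration rather than the kernel parser.

```python
import struct
from collections import namedtuple

GGUFHeader = namedtuple("GGUFHeader", "version tensor_count metadata_kv_count")

HEADER_FORMAT = "<4sIQQ"                       # magic, version, tensors, KV pairs
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 24 bytes


def parse_gguf_header(buf):
    """Validate and decode the fixed GGUF header at the start of `buf`."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("buffer too short for a GGUF header")
    magic, version, n_tensors, n_kv = struct.unpack_from(HEADER_FORMAT, buf, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file (bad magic)")
    if version < 2:
        # version 1 used 32-bit counts; this sketch does not handle it
        raise ValueError("unsupported GGUF version")
    return GGUFHeader(version, n_tensors, n_kv)
```

Typical use is to read the first 24 bytes of a model file and pass them to `parse_gguf_header`. The metadata key/value pairs and tensor descriptors follow the header and need a type-aware reader that is not shown here.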

Phase 2: AI Runtime ✅ COMPLETE

Multi-model support (load/switch/unload)
BPE tokenizer from GGUF
Transformer inference
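The core of a BPE tokenizer is a greedy merge loop driven by a ranked merge list. The sketch below assumes the merges arrive as space-separated pairs in priority order, as GGUF files commonly store them under `tokenizer.ggml.merges`, and that the vocabulary maps token strings to ids. Pre-tokenization and byte-level mapping are left out, and all function names are hypothetical.

```python
def load_merge_ranks(merges):
    """Map each merge pair to its priority; earlier entries merge first."""
    return {tuple(m.split(" ")): rank for rank, m in enumerate(merges)}


def bpe_merge(symbols, ranks):
    """Repeatedly merge the adjacent pair with the lowest rank."""
    symbols = list(symbols)
    while len(symbols) > 1:
        best = None  # (rank, index) of the best mergeable pair
        for i in range(len(symbols) - 1):
            rank = ranks.get((symbols[i], symbols[i + 1]))
            if rank is not None and (best is None or rank < best[0]):
                best = (rank, i)
        if best is None:
            break  # no known merge applies any more
        i = best[1]
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols


def encode(text, vocab, ranks):
    """Split text into characters, merge, then look up token ids."""
    return [vocab[s] for s in bpe_merge(list(text), ranks)]
```

This version merges one pair occurrence per pass and rescans the sequence each time, which is quadratic in the worst case. An implementation tuned for speed would typically keep candidate pairs in a priority queue instead.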

Phase 3: Drivers & Polish ⚠️ IN PROGRESS

VirtIO drivers (net, block)
NVMe driver for real hardware
85+ tokens/sec validation
Production ISO and documentation

See Development-Roadmap for complete breakdown.

Contributing

Full guide: Contributing - Code style, PR process, testing

Choose Your Pillar:

Kernel hacker? → Pillar-2:-Linux-Driver-Compatibility
AI researcher? → Pillar-1:-Ollama-GGUF-Integration
Performance engineer? → Pillar-3:-Performance-Optimization

Getting Started:

Read the Executive-Summary
Check Current-State-Analysis for what's done
Review the Contributing guide
Pick a task from Development-Roadmap

Documentation

Getting Started:

Getting-Started - Quick start guide
Console-Commands - Complete command reference
API-Reference - API documentation
Modelfile-Reference - Model configuration
Hardware-Requirements - Supported hardware

Technical Deep Dives:

Architecture-Overview - System architecture
Architecture-Comparison - Comparison with other systems
Bare-Metal-Deployment - Deployment guide
Performance-Benchmarks - Benchmark results
Quantized-Integer-Inference - Integer-only AI inference
Project-Structure - Codebase layout

Quick Links

GitHub: EMBODIOS Repository
Models: GGUF format from Hugging Face/Ollama
Reference: llama.cpp
Testing: QEMU x86_64 and ARM64

Why This Matters

Kernel-space AI enables:

Predictable latency: roughly 10x lower latency jitter than userspace inference, for real-time AI
Minimal footprint: 25% less memory for edge/embedded devices
Direct hardware access: No syscall overhead, zero-copy DMA
Security: Model isolation at kernel level

What started as "that's crazy" is now a production roadmap.

This underground, cyberpunk project is looking for contributors.

Last Updated: 29 January 2026
Project Status: 95% complete (AI Runtime: 100%)
Next Milestone: Real hardware testing + Performance benchmarking vs llama.cpp
