Please see the newer repository: https://github.com/Quazmoz/openvino-windows-llm
Updated May 20, 2026 - Python
A lightweight LiteLLM server boilerplate pre-configured with uv and Docker for hosting your own OpenAI- and Anthropic-compatible endpoints. Includes LibreChat as an optional web UI.
A single executable and library that combines llama.cpp, whisper.cpp, and stable-diffusion.cpp
Function-calling API for LLMs from multiple providers
macOS GUI for managing pure mlx_lm.server on Apple Silicon in Direct Mode.
A complete, menu-driven AI model interface for Windows that simplifies running local GGUF language models with llama.cpp. This tool automatically manages dependencies, provides multiple interaction modes, and prioritizes user privacy through fully offline operation.
Windows-first OpenAI-compatible local LLM server powered by OpenVINO GenAI for Intel CPU/GPU/NPU, with chat UI, model conversion, and setup scripts.
API server for the `llm` CLI tool
A flexible FastAPI-based framework for handling AI tasks using Large Language Models (LLMs). Supports multiple providers, extensible tasks and routers, Redis caching, and OpenAI integration. Easily scalable for various LLM-based applications.
OpenAI-compatible local inference server for Apple Silicon using MLX. FastAPI server with Chat Completions and Responses APIs, multi-turn conversations, and streaming support.
PHP frontend for hosting local LLMs (run via VS Code or basic PHP execution methods, or add to an existing project)
Headless CLI for managing local MLX language-model HTTP servers on Apple Silicon Macs. Supports model discovery, server lifecycle management, performance benchmarking, and provider integration with OpenCode, Claude Code, and LiteLLM.
Unified simple LLM server wrapper with intelligent routing based on model ID
Host an LLM and make it accessible on a network via API.
Run local AI models in VS Code with automatic model detection, server start, and built-in MCP endpoint—no cloud or manual setup required.