A local, fully functional developer assistant that brings spec-driven development into your IDE. Ask natural-language questions about your technical specifications and run automated code compliance checks directly from VS Code, Claude Desktop, or any MCP-compatible client.
Developer Question / Code Snippet
             │
             ▼
 MCP Server (stdio transport)
┌─────────────────────────────┐
│  get_spec(query)            │ ◄── Custom tools exposed to client
│  list_specs()               │
│  validate_code(code, spec)  │
└────────────┬────────────────┘
             │
     ┌───────▼────────┐
     │  RAG Pipeline  │
     │                │
     │ 1. Embed query │──► sentence-transformers (local, all-MiniLM-L6-v2)
     │ 2. Retrieve    │──► ChromaDB (local vector DB, cosine metric)
     │ 3. Build prompt│
     │ 4. Call LLM    │──► Ollama llama3.2 (local) or OpenAI
     └────────────────┘
             │
     ┌───────▼────────┐
     │ specs/ folder  │ ◄── Your Markdown (.md) or Text (.txt) files
     └────────────────┘
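The first three pipeline steps in the diagram can be sketched in a few lines. This is a toy illustration and not the project's code: a bag-of-words count vector stands in for the all-MiniLM-L6-v2 embedding, a plain list stands in for ChromaDB, and the names `embed`, `retrieve`, `build_prompt`, and the sample chunks are all hypothetical. The LLM call (step 4) is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the retrieved chunks and the question into one prompt string."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical spec chunks, for illustration only.
CHUNKS = [
    "Access tokens expire after a fixed lifetime and must be refreshed.",
    "Users can be created, updated, and deleted through the user API.",
    "Email notifications are sent when a webhook delivery fails.",
]

if __name__ == "__main__":
    question = "When do access tokens expire?"
    print(build_prompt(question, retrieve(question, CHUNKS, top_k=1)))
```

With a real embedding model the ranking reflects meaning rather than shared words, but the flow from query to prompt is the same.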
- Python 3.11+ installed on your system.
- Ollama installed and running locally.
- Pull the default local model using:
ollama pull llama3.2
Clone this repository, navigate to the directory, and install dependencies:
pip install -r requirements.txt
Note: The initial setup might take a moment, because the local embedding model (all-MiniLM-L6-v2, ~90MB) is downloaded on first run.
Copy the example environment configuration to create your local .env file:
# Windows
copy .env.example .env
# macOS/Linux
cp .env.example .env
The defaults are already pre-configured to work with a local Ollama server out of the box.
Put your .md or .txt specification documents inside the specs/ directory. By default, the repository contains:
- auth_spec.md: Authentication, authorization rules, and endpoints.
- user_management_spec.md: User CRUD API specifications.
- notification_spec.md: In-app, webhook, and email notification settings.
Run the ingestion pipeline to parse documents, split them into chunks, compute vector embeddings, and save them to your local database:
python ingest.py
To inspect exactly which text chunks, documents, and metadata are indexed in your local vector database, run the helper database viewer script:
python view_db.py
To test the MCP tools interactively, run the server with the MCP developer tool:
mcp dev mcp_server/server.py
This runs the server locally and launches a web-based MCP Inspector where you can invoke and test all tools in real time.
Add the server configuration to your Claude Desktop configuration file:
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Add the following JSON configuration (replacing absolute paths with your own directory path):
{
  "mcpServers": {
    "spec-assistant": {
      "command": "python",
      "args": ["C:/absolute/path/to/spec-mcp-poc/mcp_server/server.py"],
      "env": {}
    }
  }
}
Restart Claude Desktop. The new tools icon should then appear in the composer window.
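Because the `args` entry must be an absolute path, it can be convenient to generate the block rather than type it by hand. The sketch below builds the same structure as the JSON above using only the standard library. The function name `build_claude_config` is an assumption for illustration, and the script only prints JSON; it does not modify the Claude Desktop configuration file.

```python
import json
from pathlib import Path


def build_claude_config(repo_dir: str, python_cmd: str = "python") -> dict:
    """Build the mcpServers entry for the spec-assistant server."""
    server_script = Path(repo_dir) / "mcp_server" / "server.py"
    return {
        "mcpServers": {
            "spec-assistant": {
                "command": python_cmd,
                # as_posix() yields forward slashes, which need no escaping in JSON.
                "args": [server_script.as_posix()],
                "env": {},
            }
        }
    }


if __name__ == "__main__":
    # Replace with the absolute path of your own checkout.
    config = build_claude_config("C:/absolute/path/to/spec-mcp-poc")
    print(json.dumps(config, indent=2))
```

If the file already contains other servers, merge the printed `spec-assistant` entry into the existing `mcpServers` object instead of overwriting the file.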
Add this configuration block to your IDE extension configuration file:
{
  "spec-assistant": {
    "command": "python",
    "args": ["mcp_server/server.py"],
    "cwd": "C:/absolute/path/to/spec-mcp-poc"
  }
}

| Tool Name | Parameters | Description |
|---|---|---|
| get_spec | query (str) | Ask natural-language questions about specifications. Uses semantic vector search to augment your LLM's response. |
| list_specs | None | Returns a detailed list of all specifications currently parsed and indexed in the database. |
| validate_code | code (str), spec_name (str) | Validates code snippets against specifications and returns a list of compliant features, violations, and a recommendation checklist. |
spec-mcp-poc/
├── specs/ # 📄 Raw specification files (add yours here)
│ ├── auth_spec.md
│ ├── user_management_spec.md
│ └── notification_spec.md
├── ingestion/
│ ├── chunker.py # Loads specs and splits them into sliding-window text chunks
│ ├── embedder.py # Embeds chunks using sentence-transformers (local model)
│ └── indexer.py # Interfaces with ChromaDB (creation, deletes, insertions)
├── rag/
│ ├── retriever.py # Performs semantic search querying database via cosine distance
│ ├── prompt_builder.py # Generates LLM chat prompts for retrieval and compliance checks
│ └── llm_client.py # Routes requests to Ollama (local) or OpenAI (cloud)
├── mcp_server/
│ └── server.py # MCP Server exposing standard tools over stdio
├── config.py # Central configurations and environment reader
├── ingest.py # Entry point command line pipeline to run ingestion
├── view_db.py # Helper utility script to view local vector database entries
├── requirements.txt # Python dependencies
├── .gitignore # Git ignored patterns
├── .env # Local environment configurations (ignored)
└── .env.example # Template configuration
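The tree describes chunker.py as splitting specs into sliding-window text chunks. A minimal sketch of that technique follows. It is not the repository's implementation: the function name `sliding_window_chunks` and the `chunk_size` and `overlap` defaults are assumptions chosen for illustration, and the windows here are measured in characters, whereas the real chunker may count words or tokens.

```python
def sliding_window_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    print(sliding_window_chunks("abcdefghij", chunk_size=4, overlap=2))
    # → ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.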
If you prefer using OpenAI cloud endpoints instead of local Ollama, update your .env file configuration:
LLM_BACKEND=openai
OPENAI_API_KEY=sk-proj-your-actual-api-key
OPENAI_MODEL=gpt-4o-mini
No code modifications are required; the system selects the backend from these settings.
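One way to select a backend from these settings is to read LLM_BACKEND once and dispatch on its value. The sketch below is an assumption about the mechanism, not the contents of config.py or llm_client.py. The function `select_backend`, the `ollama` default value, and the returned dictionary shape are hypothetical; only the variable names LLM_BACKEND, OPENAI_API_KEY, and OPENAI_MODEL come from the settings above.

```python
import os
from typing import Mapping, Optional


def select_backend(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return backend settings based on LLM_BACKEND (assumed default: "ollama")."""
    env = os.environ if env is None else env
    backend = env.get("LLM_BACKEND", "ollama").strip().lower()

    if backend == "openai":
        api_key = env.get("OPENAI_API_KEY", "")
        if not api_key:
            raise ValueError("LLM_BACKEND=openai requires OPENAI_API_KEY to be set")
        return {
            "backend": "openai",
            "model": env.get("OPENAI_MODEL", "gpt-4o-mini"),
            "api_key": api_key,
        }
    if backend == "ollama":
        # llama3.2 is the default local model named in this README.
        return {"backend": "ollama", "model": "llama3.2"}
    raise ValueError(f"Unknown LLM_BACKEND: {backend!r}")


if __name__ == "__main__":
    print(select_backend({"LLM_BACKEND": "openai", "OPENAI_API_KEY": "sk-example"}))
```

Failing early on a missing key or an unknown backend name surfaces a misconfigured .env at startup rather than on the first tool call.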
| Problem | Potential Cause | Troubleshooting Action |
|---|---|---|
| No indexed specs found | Database has not been initialized. | Run python ingest.py to index specs. |
| Connection refused (Ollama) | Ollama service is not running. | Make sure the Ollama application is running, or run ollama serve. |
| Model not found | The model is missing in Ollama. | Run ollama pull llama3.2 to download the model. |
| Slow execution during first run | Cold start downloads. | The local embedding model (all-MiniLM-L6-v2) is downloaded once and cached for future runs. |
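
The "Connection refused" check from the table can also be scripted. This sketch assumes Ollama is listening on its default address, http://localhost:11434; the helper name `is_ollama_running` is hypothetical and not part of this repository.

```python
import urllib.error
import urllib.request


def is_ollama_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if base_url answers with HTTP 200, False on any connection or HTTP error."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    if is_ollama_running():
        print("Ollama is reachable.")
    else:
        print("Ollama is not reachable; start the app or run: ollama serve")
```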