feat(mcp): tool-description tightening +5% P@1 + OpenRouter backend (v0.11.6)

sdsrss · claude · sdsrss · commit d6f1dc0dc62a · 2026-04-17T07:29:55.000+08:00
Routing-recall bench first run on v0.11.4: 18/20 = 90.0% (both misses came
from semantic overlap between adjacent tools). Tightened 4 descriptions so
each leads with a shape verb and explicitly deflects to the adjacent tool.
Rerun: 19/20 = 95.0%, net +5.0 pt. The remaining miss ("show me
EmbeddingModel struct" → ast_search) is borderline, since ast_search with
type=struct still returns the right answer.
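
For readers unfamiliar with the metric, the sketch below shows how a P@1 figure like the ones above could be computed from per-query results. `RoutingCase` and `precision_at_1` are hypothetical names for illustration, not the bench's actual types, and the 19 matching cases are placeholders rather than the real query set.

```rust
/// One bench query: the tool we expected the model to pick, and the tool it
/// actually picked first. Hypothetical type, not taken from routing_bench.rs.
struct RoutingCase {
    expected: &'static str,
    actual: &'static str,
}

/// Precision at 1: fraction of queries whose first tool call matched the
/// expected tool. An empty set scores 0.0 rather than dividing by zero.
fn precision_at_1(cases: &[RoutingCase]) -> f64 {
    if cases.is_empty() {
        return 0.0;
    }
    let hits = cases.iter().filter(|c| c.expected == c.actual).count();
    hits as f64 / cases.len() as f64
}

fn main() {
    // 19 placeholder hits plus the one remaining miss described above.
    let mut cases: Vec<RoutingCase> = (0..19)
        .map(|_| RoutingCase { expected: "get_call_graph", actual: "get_call_graph" })
        .collect();
    cases.push(RoutingCase { expected: "get_ast_node", actual: "ast_search" });

    let p1 = precision_at_1(&cases);
    println!("P@1 = {}/{} = {:.1}%", 19, cases.len(), p1 * 100.0);
}
```

With only 20 queries, each query is worth 5 points, so a single flipped routing decision accounts for the whole delta between the two runs.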

Tool-description changes (src/mcp/tools.rs, all ≤200 chars):
- get_call_graph: leads with "Who calls X, what X calls" + "Returns a graph"
- find_references: "Flat enumeration of all usage sites" + "For 'who calls X?',
  use get_call_graph"
- get_ast_node: "Inspect ONE named symbol" + "you have a symbol name"
- ast_search: "Enumerate MULTIPLE symbols" + "For ONE known symbol, use
  get_ast_node"
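
The 200-char limit and the deflection pattern are both easy to check mechanically. The sketch below is one way to do that, using the four new description strings from this commit; `over_limit` and `deflects_to` are hypothetical helpers, not functions in `src/mcp/tools.rs`.

```rust
/// Character limit mentioned in this commit for tool descriptions.
const MAX_DESCRIPTION_CHARS: usize = 200;

/// (tool name, description) pairs copied from the v0.11.6 diff.
const DESCRIPTIONS: [(&str, &str); 4] = [
    ("get_call_graph", "Who calls X, what X calls: multi-hop call-chain for a named function. Use when: 'who calls X?' or 'what does X call?' or tracing flow with depth. Returns a graph."),
    ("get_ast_node", "Inspect ONE named symbol: signature, full source, optional references/impact. Use when: you have a symbol name (or node_id) and want its definition/body."),
    ("ast_search", "Enumerate MULTIPLE symbols by structural criteria (type, return, params). Use when: 'all structs in module X' or 'all fns returning Vec<T>'. For ONE known symbol, use get_ast_node."),
    ("find_references", "Flat enumeration of all usage sites (calls/imports/inherits/implements). Use when: auditing every place a symbol is touched before rename/remove. For 'who calls X?', use get_call_graph."),
];

/// Names of tools whose description exceeds `limit` characters.
fn over_limit<'a>(tools: &[(&'a str, &'a str)], limit: usize) -> Vec<&'a str> {
    tools
        .iter()
        .filter(|(_, desc)| desc.chars().count() > limit)
        .map(|(name, _)| *name)
        .collect()
}

/// True if the description of `tool` names `adjacent` as the tool to use instead.
fn deflects_to(tools: &[(&str, &str)], tool: &str, adjacent: &str) -> bool {
    tools
        .iter()
        .any(|(name, desc)| *name == tool && desc.contains(adjacent))
}

fn main() {
    for (name, desc) in DESCRIPTIONS.iter() {
        println!("{:<16} {} chars", name, desc.chars().count());
    }
    println!("over limit: {:?}", over_limit(&DESCRIPTIONS, MAX_DESCRIPTION_CHARS));
    println!(
        "find_references deflects to get_call_graph: {}",
        deflects_to(&DESCRIPTIONS, "find_references", "get_call_graph")
    );
}
```

A check like this could run as a unit test so that a future description edit cannot silently exceed the limit or drop a deflection.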

Routing-bench OpenRouter backend (tests/routing_bench.rs):
Auto-detect ANTHROPIC_API_KEY or OPENROUTER_API_KEY. OpenAI-compat schema
conversion (tools -> {type: function, function: {...}}). Model default
anthropic/claude-sonnet-4.5; override with ROUTING_BENCH_MODEL.
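
A minimal, std-only sketch of the selection and conversion logic described above follows. It is not the code in tests/routing_bench.rs: `Backend`, `select_backend`, `resolve_model`, `to_openai_tool`, and `escape_json` are hypothetical names, treating an empty key as absent is an assumption, and the hand-rolled JSON string building stands in for whatever JSON library the bench actually uses.

```rust
use std::env;

const DEFAULT_MODEL: &str = "anthropic/claude-sonnet-4.5";

#[derive(Debug, PartialEq)]
enum Backend {
    Anthropic,  // native Messages API
    OpenRouter, // OpenAI-compatible /chat/completions
}

/// Picks a backend from the two keys; Anthropic wins if both are present.
/// Assumption: empty or whitespace-only keys count as absent.
fn select_backend(anthropic_key: Option<&str>, openrouter_key: Option<&str>) -> Option<Backend> {
    let present = |k: Option<&str>| k.map_or(false, |v| !v.trim().is_empty());
    if present(anthropic_key) {
        Some(Backend::Anthropic)
    } else if present(openrouter_key) {
        Some(Backend::OpenRouter)
    } else {
        None
    }
}

/// Model name: the override (e.g. from ROUTING_BENCH_MODEL) or the default.
fn resolve_model(override_model: Option<&str>) -> String {
    override_model
        .filter(|m| !m.is_empty())
        .unwrap_or(DEFAULT_MODEL)
        .to_string()
}

/// Escapes a string for embedding inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Wraps an Anthropic-style tool (name, description, input_schema) in the
/// OpenAI function-tool envelope. `input_schema_json` must already be valid JSON.
fn to_openai_tool(name: &str, description: &str, input_schema_json: &str) -> String {
    format!(
        r#"{{"type":"function","function":{{"name":"{}","description":"{}","parameters":{}}}}}"#,
        escape_json(name),
        escape_json(description),
        input_schema_json
    )
}

fn main() {
    let anthropic = env::var("ANTHROPIC_API_KEY").ok();
    let openrouter = env::var("OPENROUTER_API_KEY").ok();
    let model_override = env::var("ROUTING_BENCH_MODEL").ok();

    match select_backend(anthropic.as_deref(), openrouter.as_deref()) {
        Some(backend) => println!("backend: {:?}, model: {}", backend, resolve_model(model_override.as_deref())),
        None => println!("no API key set; skipping bench"),
    }
    println!("{}", to_openai_tool("get_call_graph", "Who calls X, what X calls", r#"{"type":"object"}"#));
}
```

The only structural difference between the two tool formats is the envelope: the Anthropic `input_schema` object moves under `function.parameters`, and the name and description move under `function`.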

Baselines (tracked in feedback_routing_bench memory):
  v0.11.4  18/20 = 90.0%  openrouter/anthropic/claude-sonnet-4.5
  v0.11.6  19/20 = 95.0%  openrouter/anthropic/claude-sonnet-4.5

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -5,14 +5,14 @@
   },
   "metadata": {
     "description": "AST knowledge graph plugin for Claude Code — semantic search, call graph, HTTP tracing, impact analysis",
-    "version": "0.11.5"
+    "version": "0.11.6"
   },
   "plugins": [
     {
       "name": "code-graph-mcp",
       "source": "./claude-plugin",
       "description": "AST knowledge graph for intelligent code navigation — auto-indexes your codebase and provides semantic search, call graph traversal, HTTP route tracing, and impact analysis via MCP tools",
-      "version": "0.11.5",
+      "version": "0.11.6",
       "author": {
         "name": "sdsrs"
       },
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,53 @@
 # Changelog
 
+## v0.11.6 — Tool-description tightening (+5% routing P@1) + OpenRouter backend
+
+First run of the routing-recall benchmark landed v0.11.4 at **P@1 = 18/20 = 90.0%**
+(`anthropic/claude-sonnet-4.5` via OpenRouter). The two misses were both semantic
+overlaps between adjacent tools. This release tightens 4 tool descriptions and
+re-runs the bench: **P@1 = 19/20 = 95.0%**, a net +5.0 points with one miss
+remaining (borderline — "show me the EmbeddingModel struct" routes to `ast_search`
+with `type=struct`, which returns the right answer albeit via the "enumerate"
+tool rather than the "inspect ONE" tool).
+
+### Tool-description changes (`src/mcp/tools.rs`)
+
+All stay under the 200-char registry limit.
+
+- **`get_call_graph`** — leads with `"Who calls X, what X calls"` + `"Returns a
+  graph (not a flat list)"`. Fixed routing for "Who calls ensure_indexed?"
+  (was → `find_references`, now → `get_call_graph`).
+- **`find_references`** — leads with `"Flat enumeration of all usage sites"` +
+  explicit deflection: `"For 'who calls X?', use get_call_graph."`.
+- **`get_ast_node`** — leads with `"Inspect ONE named symbol"` + `"you have a
+  symbol name (or node_id) and want its definition/body"` to claim the
+  "show me X / signature of Y" intent.
+- **`ast_search`** — leads with `"Enumerate MULTIPLE symbols by structural
+  criteria"` + deflection: `"For ONE known symbol, use get_ast_node."`.
+
+Pattern: each description now leads with a shape verb (`who calls`, `flat
+enumeration`, `inspect ONE`, `enumerate MULTIPLE`) and points at the
+adjacent tool when a query drifts into overlap.
+
+### Routing-bench OpenRouter backend (`tests/routing_bench.rs`)
+
+Auto-detects `ANTHROPIC_API_KEY` (native Messages API) or `OPENROUTER_API_KEY`
+(OpenAI-compatible `/chat/completions`). Tool schemas re-packaged as
+`{type: "function", function: {...}}` for the OpenRouter path. Model default
+`anthropic/claude-sonnet-4.5`; override with `ROUTING_BENCH_MODEL`. Anthropic
+wins if both keys present.
+
+### Baseline measurement (published)
+
+| Run | Backend / Model | P@1 |
+|-----|-----------------|-----|
+| v0.11.4 baseline | openrouter / anthropic/claude-sonnet-4.5 | 18/20 (90.0%) |
+| v0.11.6 post-tightening | openrouter / anthropic/claude-sonnet-4.5 | 19/20 (95.0%) |
+
+Cost ≈ $0.10/run. Threshold stays at 0.70; consider raising to 0.85 after two
+more releases confirm 95% as stable baseline (20-query sample is within model
+stochasticity range).
+
 ## v0.11.5 — Hotfix: clippy 1.95 parity (`unnecessary_sort_by`)
 
 `-D warnings` on stable clippy 1.95 flagged the two `sort_by(|a, b| b.0.cmp(&a.0))`
diff --git a/Cargo.lock b/Cargo.lock
diff --git a/Cargo.toml b/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "code-graph-mcp"
-version = "0.11.5"
+version = "0.11.6"
 edition = "2021"
 
 [features]
diff --git a/claude-plugin/.claude-plugin/plugin.json b/claude-plugin/.claude-plugin/plugin.json
@@ -4,7 +4,7 @@
   "author": {
     "name": "sdsrs"
   },
-  "version": "0.11.5",
+  "version": "0.11.6",
   "keywords": [
     "code-graph",
     "ast",
diff --git a/npm/darwin-arm64/package.json b/npm/darwin-arm64/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@sdsrs/code-graph-darwin-arm64",
-  "version": "0.11.5",
+  "version": "0.11.6",
   "description": "code-graph-mcp binary for macOS ARM64",
   "license": "MIT",
   "repository": {
diff --git a/npm/darwin-x64/package.json b/npm/darwin-x64/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@sdsrs/code-graph-darwin-x64",
-  "version": "0.11.5",
+  "version": "0.11.6",
   "description": "code-graph-mcp binary for macOS x64",
   "license": "MIT",
   "repository": {
diff --git a/npm/linux-arm64/package.json b/npm/linux-arm64/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@sdsrs/code-graph-linux-arm64",
-  "version": "0.11.5",
+  "version": "0.11.6",
   "description": "code-graph-mcp binary for Linux ARM64",
   "license": "MIT",
   "repository": {
diff --git a/npm/linux-x64/package.json b/npm/linux-x64/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@sdsrs/code-graph-linux-x64",
-  "version": "0.11.5",
+  "version": "0.11.6",
   "description": "code-graph-mcp binary for Linux x64",
   "license": "MIT",
   "repository": {
diff --git a/npm/win32-x64/package.json b/npm/win32-x64/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@sdsrs/code-graph-win32-x64",
-  "version": "0.11.5",
+  "version": "0.11.6",
   "description": "code-graph-mcp binary for Windows x64",
   "license": "MIT",
   "repository": {
diff --git a/package.json b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@sdsrs/code-graph",
-  "version": "0.11.5",
+  "version": "0.11.6",
   "description": "MCP server that indexes codebases into an AST knowledge graph with semantic search, call graph traversal, and HTTP route tracing",
   "license": "MIT",
   "repository": {
@@ -34,10 +34,10 @@
     "node": ">=16"
   },
   "optionalDependencies": {
-    "@sdsrs/code-graph-linux-x64": "0.11.5",
-    "@sdsrs/code-graph-linux-arm64": "0.11.5",
-    "@sdsrs/code-graph-darwin-x64": "0.11.5",
-    "@sdsrs/code-graph-darwin-arm64": "0.11.5",
-    "@sdsrs/code-graph-win32-x64": "0.11.5"
+    "@sdsrs/code-graph-linux-x64": "0.11.6",
+    "@sdsrs/code-graph-linux-arm64": "0.11.6",
+    "@sdsrs/code-graph-darwin-x64": "0.11.6",
+    "@sdsrs/code-graph-darwin-arm64": "0.11.6",
+    "@sdsrs/code-graph-win32-x64": "0.11.6"
   }
 }
diff --git a/src/mcp/tools.rs b/src/mcp/tools.rs
@@ -48,7 +48,7 @@ impl ToolRegistry {
             },
             ToolDefinition {
                 name: "get_call_graph".into(),
-                description: "Call chain for a function. Use when: tracing who calls it / what it calls, understanding flow before modifying. Recursive with depth tracking.".into(),
+                description: "Who calls X, what X calls: multi-hop call-chain for a named function. Use when: 'who calls X?' or 'what does X call?' or tracing flow with depth. Returns a graph.".into(),
                 input_schema: json!({
                     "type": "object",
                     "properties": {
@@ -64,7 +64,7 @@ impl ToolRegistry {
             },
             ToolDefinition {
                 name: "get_ast_node".into(),
-                description: "Get symbol details: type, signature, code, references, impact. Use when: inspecting a function/class before editing it. Accepts symbol_name, node_id, or file_path+symbol_name.".into(),
+                description: "Inspect ONE named symbol: signature, full source, optional references/impact. Use when: you have a symbol name (or node_id) and want its definition/body.".into(),
                 input_schema: json!({
                     "type": "object",
                     "properties": {
@@ -105,7 +105,7 @@ impl ToolRegistry {
             },
             ToolDefinition {
                 name: "ast_search".into(),
-                description: "Structural code search by type/return/params. Use when: finding all functions returning a type, or querying code structure that grep can't express.".into(),
+                description: "Enumerate MULTIPLE symbols by structural criteria (type, return, params). Use when: 'all structs in module X' or 'all fns returning Vec<T>'. For ONE known symbol, use get_ast_node.".into(),
                 input_schema: json!({
                     "type": "object",
                     "properties": {
@@ -120,7 +120,7 @@ impl ToolRegistry {
             },
             ToolDefinition {
                 name: "find_references".into(),
-                description: "All references to a symbol. Use when: checking if safe to rename/remove, or finding all usage points before refactoring. Shows callers, importers, inheritors.".into(),
+                description: "Flat enumeration of all usage sites (calls/imports/inherits/implements). Use when: auditing every place a symbol is touched before rename/remove. For 'who calls X?', use get_call_graph.".into(),
                 input_schema: json!({
                     "type": "object",
                     "properties": {
diff --git a/tests/routing_bench.rs b/tests/routing_bench.rs
