From 7a6bc856e5b64ed15671b07dfd7b41f28c082003 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Tue, 2 Jun 2026 10:41:42 -0500
Subject: [PATCH 01/14] feat(reasoning): add reasoning_replay knob
 (full/keep-last/none)

Bound reasoning accumulation in the forge->backend direction. Adds the
core/reasoning.py policy module and threads reasoning_replay through
the inference, runner, and proxy convert/handler paths. keep-last (the
new default) emits reasoning via reasoning_content for round-trip
re-capture and trims older reasoning; none strips it; full preserves
prior behavior. The Anthropic path emits reasoning only under full,
since keep-last has no signable channel there. Includes docs (README,
BACKEND_SETUP, USER_GUIDE) and unit tests.
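
For reviewers, a minimal standalone sketch of the three policies applied
to raw OpenAI-style messages. It is an illustration only, not the code
in core/reasoning.py: apply_replay and FIELDS are hypothetical names,
and the field list is assumed to mirror REASONING_MESSAGE_FIELDS.

```python
from copy import deepcopy

# Hypothetical stand-in for the policy module; field names are assumed
# to match REASONING_MESSAGE_FIELDS from this patch.
FIELDS = ("reasoning_content", "reasoning", "reasoning_text")


def apply_replay(messages, policy="keep-last"):
    """Return a copy of messages with reasoning trimmed per policy."""
    out = [deepcopy(m) for m in messages]
    if policy == "full":
        return out  # replay everything, untouched
    last = None
    if policy == "keep-last":
        # Remember the index of the latest assistant message that
        # carries any reasoning field.
        for i, m in enumerate(out):
            if m.get("role") == "assistant" and any(m.get(f) for f in FIELDS):
                last = i
    for i, m in enumerate(out):
        # Under "none", last stays None, so every assistant message
        # loses its reasoning fields.
        if m.get("role") == "assistant" and i != last:
            for f in FIELDS:
                m.pop(f, None)
    return out


history = [
    {"role": "user", "content": "list files"},
    {"role": "assistant", "content": None, "reasoning_content": "old plan"},
    {"role": "user", "content": "now read one"},
    {"role": "assistant", "content": None, "reasoning_content": "new plan"},
]
trimmed = apply_replay(history, "keep-last")
```

With keep-last, only the final assistant message keeps its
reasoning_content; the input list is left unmodified because the
function works on deep copies.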

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md                                  |   2 +-
 docs/BACKEND_SETUP.md                      |   2 +-
 docs/USER_GUIDE.md                         |   4 +
 src/forge/__init__.py                      |  12 +-
 src/forge/core/inference.py                |  65 ++++++++--
 src/forge/core/reasoning.py                |  53 ++++++++
 src/forge/core/runner.py                   |   6 +
 src/forge/proxy/__main__.py                |   9 ++
 src/forge/proxy/convert.py                 |  44 +++++--
 src/forge/proxy/convert_anthropic.py       |   9 +-
 src/forge/proxy/handler.py                 |  43 +++++--
 src/forge/proxy/proxy.py                   |   6 +
 src/forge/proxy/server.py                  |   4 +
 tests/unit/test_inference_passthrough.py   |  63 +++++++++-
 tests/unit/test_proxy_convert.py           |  62 +++++++++-
 tests/unit/test_proxy_convert_anthropic.py |  18 ++-
 tests/unit/test_proxy_handler.py           |  37 +++++-
 tests/unit/test_reasoning_replay.py        | 136 +++++++++++++++++++++
 18 files changed, 537 insertions(+), 38 deletions(-)
 create mode 100644 src/forge/core/reasoning.py
 create mode 100644 tests/unit/test_reasoning_replay.py

diff --git a/README.md b/README.md
index e1ed4a6..e1b827a 100644
--- a/README.md
+++ b/README.md
@@ -128,7 +128,7 @@ For multi-step workflows, multi-turn conversations, and backend auto-management,
 
 Drop-in proxy that sits between any client and a local model server, speaking both the OpenAI chat-completions API and the Anthropic Messages API (`/v1/messages`). Point your client at the proxy (e.g. `http://localhost:8081/v1`) and forge applies its guardrails transparently — the client thinks it's talking to a smarter model.
 
-This is the path for **using forge with an existing harness** (opencode, Continue, aider, Cline, anything that speaks the OpenAI chat-completions schema — or Claude Code, which speaks the Anthropic Messages API). No Python rewrite.
+This is the path for **using forge with an existing harness** (opencode, Continue, aider, Cline, anything that speaks the OpenAI chat-completions schema — or Claude Code, which speaks the Anthropic Messages API). No Python rewrite. Reasoning replay defaults to `keep-last`: forge captures reasoning for observability and replays only the latest available reasoning block to the backend on later turns. Use `--reasoning-replay full` for the historical replay-all behavior, or `--reasoning-replay none` to keep captured reasoning out of backend-facing history.
 
 ```bash
 # External mode — you manage the backend, forge proxies it
diff --git a/docs/BACKEND_SETUP.md b/docs/BACKEND_SETUP.md
index 75e667d..26702d3 100644
--- a/docs/BACKEND_SETUP.md
+++ b/docs/BACKEND_SETUP.md
@@ -75,7 +75,7 @@ llamafile --server --nobrowser -m path/to/model.gguf --port 8080 -ngl 999
 
 `LlamafileClient` is **native-first**: `mode="native"` (the default) forwards tools via the backend's `tools` parameter and requires native function calling (llama.cpp with `--jinja`). For a backend without native FC, declare `mode="prompt"` to inject tool descriptions into the prompt and parse the JSON call back out. The capability is declared at construction and frozen — there is no runtime auto-detection. Native-first is the default because local-model FC support has matured into the more reliable path; prompt-injection stays fully supported as an explicit opt-in, but note that on more complex, multi-step interactions models tend to struggle to drive the prompt-injected protocol reliably, so reach for it only when the backend leaves no alternative.
 
-> **Proxy note:** the OpenAI-compatible proxy is **native-first**. By default (`--backend-capability native`) it forwards the client's tools verbatim to an FC-capable backend (llama.cpp with `--jinja`, vLLM, Ollama, Anthropic) — the recommended setup. For a non-FC llama.cpp/llamafile backend, opt into prompt-injection with `--backend-capability prompt` (strips tools into the prompt, parses the JSON call back; reuses the same prompt path as the WorkflowRunner). The choice is frozen at startup — there is no runtime auto-detect in the proxy. See ADR-012.
+> **Proxy note:** the OpenAI-compatible proxy is **native-first**. By default (`--backend-capability native`) it forwards the client's tools verbatim to an FC-capable backend (llama.cpp with `--jinja`, vLLM, Ollama, Anthropic) — the recommended setup. For a non-FC llama.cpp/llamafile backend, opt into prompt-injection with `--backend-capability prompt` (strips tools into the prompt, parses the JSON call back; reuses the same prompt path as the WorkflowRunner). The choice is frozen at startup — there is no runtime auto-detect in the proxy. See ADR-012. Reasoning replay is controlled separately with `--reasoning-replay {full,keep-last,none}`; the default `keep-last` replays only the latest captured reasoning block to the backend when that reasoning is available in the conversation history.
 
 Smoke-test:
 
diff --git a/docs/USER_GUIDE.md b/docs/USER_GUIDE.md
index ac04a1f..e8ec90f 100644
--- a/docs/USER_GUIDE.md
+++ b/docs/USER_GUIDE.md
@@ -85,6 +85,8 @@ claude
 
 **Function-calling capability.** `--backend-capability native` (default) uses the backend's chat-template tool-calling and is the smoother default for Claude Code's heavy multi-turn tool use. `--backend-capability prompt` injects the tool surface into the prompt for llama.cpp/llamafile backends without a tool-calling template; whether a model stays coherent across multi-turn tool results in prompt mode varies by model — and tends to degrade on more complex, multi-step interactions — so prefer native whenever the backend supports it. The capability is declared at startup and frozen.
 
+**Reasoning replay.** Reasoning-capable backends may return hidden reasoning alongside tool calls. Forge captures that reasoning for observability, then controls how much is replayed to the backend on later turns with `--reasoning-replay {full,keep-last,none}`. The default is `keep-last`: only the latest captured reasoning block is replayed. `full` preserves the historical behavior and replays every captured reasoning block. `none` keeps reasoning out of backend-facing history. In OpenAI-compatible proxy responses, `keep-last` exposes current reasoning as `reasoning_content` instead of normal assistant `content`, so clients that preserve reasoning fields can replay only the latest block without turning it into plain text. Anthropic proxy responses emit reasoning text only under `full`; forge does not synthesize signed Anthropic thinking blocks, so under the default `keep-last` (and under `none`) Anthropic proxy responses do not expose replayable reasoning.
+
 **Downstream protocol.**
 
 - **Local model (default, `--backend-protocol openai`)** — forge translates Claude Code's Anthropic requests to OpenAI for llama.cpp / Ollama and converts the reply back to Anthropic SSE. Anthropic-only fields with no OpenAI analog (`cache_control`, `thinking`, `document` blocks) are dropped at that boundary; see [ADR-015](decisions/015-cache-control-preservation-path1.md).
@@ -283,6 +285,8 @@ await server.stop()
 
 `WorkflowRunner` accepts an optional `on_message` callback that fires each time a `Message` is appended to the conversation during `run()`. This is the primary observability hook — use it for logging, eval metric collection, or building conversation history for multi-turn flows.
 
+`WorkflowRunner(reasoning_replay=...)` uses the same policy as the proxy: `keep-last` by default, `full` for the historical replay-all behavior, and `none` to avoid replaying captured reasoning to the backend. The policy affects backend-facing serialization only; `MessageType.REASONING` entries still appear in `on_message` and internal history unless context compaction removes them.
+
 - **Single-turn (default):** `on_message` fires for every message the runner creates — system prompt, user input, assistant responses, tool results, nudges.
 - **Multi-turn (`initial_messages`):** `run()` accepts an optional `initial_messages` parameter that seeds the conversation with prior history. `on_message` fires **only for new messages created during this turn**, not for the replayed history.
 
diff --git a/src/forge/__init__.py b/src/forge/__init__.py
index b6b8a27..7680419 100644
--- a/src/forge/__init__.py
+++ b/src/forge/__init__.py
@@ -17,7 +17,13 @@
     Workflow,
 )
 from forge.core.steps import StepTracker
-from forge.core.inference import InferenceResult, fold_and_serialize, run_inference
+from forge.core.inference import (
+    InferenceResult,
+    fold_and_serialize,
+    prepare_backend_messages,
+    run_inference,
+)
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, REASONING_REPLAY_CHOICES, ReasoningReplay
 from forge.core.runner import WorkflowRunner
 from forge.core.slot_worker import SlotWorker
 from forge.clients.base import ChunkType, LLMClient, StreamChunk, TokenUsage
@@ -87,7 +93,11 @@
     # Inference (front half — shared by runner and proxy)
     "InferenceResult",
     "fold_and_serialize",
+    "prepare_backend_messages",
     "run_inference",
+    "DEFAULT_REASONING_REPLAY",
+    "REASONING_REPLAY_CHOICES",
+    "ReasoningReplay",
     # Runner
     "WorkflowRunner",
     # Slot worker
diff --git a/src/forge/core/inference.py b/src/forge/core/inference.py
index 599b5e7..aa1d365 100644
--- a/src/forge/core/inference.py
+++ b/src/forge/core/inference.py
@@ -23,6 +23,12 @@
 )
 from forge.context.manager import ContextManager
 from forge.core.messages import Message, MessageMeta, MessageRole, MessageType, ToolCallInfo
+from forge.core.reasoning import (
+    DEFAULT_REASONING_REPLAY,
+    ReasoningReplay,
+    filter_openai_reasoning_messages,
+    validate_reasoning_replay,
+)
 from forge.core.workflow import LLMResponse, TextResponse, ToolCall, ToolSpec
 from forge.errors import StreamError, ToolCallError
 from forge.guardrails import ErrorTracker, ResponseValidator
@@ -77,19 +83,32 @@ class InferenceResult:
 def fold_and_serialize(
     messages: list[Message],
     api_format: str,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> list[dict[str, Any]]:
     """Reasoning-fold and serialize forge Messages to API dicts.
 
-    Folds REASONING messages into the following TOOL_CALL message's content
-    field so the wire format has one assistant message with both content and
-    tool_calls (valid OpenAI format). Internal Message list stays separate
-    for compaction.
+    ``full`` folds every REASONING message into the following TOOL_CALL
+    message's content field, preserving the historical wire behavior.
+    ``keep-last`` folds only the most recent REASONING message in the
+    serialized history. ``none`` skips all REASONING messages on the wire.
+    Internal Message history stays separate for compaction and observability.
     """
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
     api_messages: list[dict[str, Any]] = []
     pending_reasoning: str | None = None
+    last_reasoning_index: int | None = None
+
+    if reasoning_replay == "keep-last":
+        for i, m in enumerate(messages):
+            if m.metadata.type == MessageType.REASONING and m.role == MessageRole.ASSISTANT:
+                last_reasoning_index = i
 
-    for m in messages:
+    for i, m in enumerate(messages):
         if m.metadata.type == MessageType.REASONING and m.role == MessageRole.ASSISTANT:
+            if reasoning_replay == "none":
+                continue
+            if reasoning_replay == "keep-last" and i != last_reasoning_index:
+                continue
             pending_reasoning = m.content
             continue
         d = m.to_api_dict(format=api_format)
@@ -107,6 +126,29 @@ def fold_and_serialize(
     return api_messages
 
 
+def prepare_backend_messages(
+    messages: list[Message],
+    api_format: str,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
+    raw_openai_messages: RawOpenAIMessages | None = None,
+    use_raw_messages: bool = False,
+) -> list[dict[str, Any]]:
+    """Prepare backend-facing messages from raw OpenAI or forge history.
+
+    This is the single backend replay-policy choke point. Raw OpenAI messages
+    preserve client-authored shape while filtering only reasoning fields; forge
+    history is folded with the same reasoning replay policy.
+    """
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
+    if use_raw_messages and raw_openai_messages is not None:
+        return filter_openai_reasoning_messages(
+            raw_openai_messages, reasoning_replay=reasoning_replay,
+        )
+    return fold_and_serialize(
+        messages, api_format, reasoning_replay=reasoning_replay,
+    )
+
+
 def _build_tool_call_infos(
     tool_calls: list[ToolCall],
     tool_call_counter: int,
@@ -138,6 +180,7 @@ async def run_inference(
     inbound_anthropic_body: dict[str, Any] | None = None,
     raw_openai_messages: RawOpenAIMessages | None = None,
     raw_openai_tools: RawOpenAITools | None = None,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> InferenceResult | None:
     """Send messages to the LLM with compaction, folding, validation, and retry.
 
@@ -177,6 +220,7 @@ async def run_inference(
         ToolCallError: If retry budget (max_retries) is exhausted.
         StreamError: If streaming ends without a FINAL chunk.
     """
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
     api_format = getattr(client, "api_format", "ollama")
     new_messages: list[Message] = []
     max_retries = error_tracker.max_retries
@@ -219,10 +263,13 @@ async def run_inference(
             and compacted is messages
             and not context_warning
         )
-        if use_raw_messages:
-            api_messages = raw_openai_messages
-        else:
-            api_messages = fold_and_serialize(messages, api_format)
+        api_messages = prepare_backend_messages(
+            messages,
+            api_format,
+            reasoning_replay=reasoning_replay,
+            raw_openai_messages=raw_openai_messages,
+            use_raw_messages=use_raw_messages,
+        )
 
         # Inject context warning as transient user message (not persisted
         # in conversation history). Uses "user" role because mid-conversation
diff --git a/src/forge/core/reasoning.py b/src/forge/core/reasoning.py
new file mode 100644
index 0000000..df0acec
--- /dev/null
+++ b/src/forge/core/reasoning.py
@@ -0,0 +1,53 @@
+"""Reasoning replay policy shared by runner and proxy."""
+
+from __future__ import annotations
+
+from copy import deepcopy
+from typing import Any, Literal
+
+
+ReasoningReplay = Literal["full", "keep-last", "none"]
+REASONING_REPLAY_CHOICES: tuple[ReasoningReplay, ...] = ("full", "keep-last", "none")
+DEFAULT_REASONING_REPLAY: ReasoningReplay = "keep-last"
+
+
+def validate_reasoning_replay(value: str) -> ReasoningReplay:
+    """Validate and normalize a reasoning replay policy."""
+    if value not in REASONING_REPLAY_CHOICES:
+        choices = ", ".join(REASONING_REPLAY_CHOICES)
+        raise ValueError(f"reasoning_replay must be one of: {choices}")
+    return value  # type: ignore[return-value]
+
+
+REASONING_MESSAGE_FIELDS = ("reasoning_content", "reasoning", "reasoning_text")
+
+
+def filter_openai_reasoning_messages(
+    messages: list[dict[str, Any]],
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
+) -> list[dict[str, Any]]:
+    """Copy raw OpenAI messages and apply the reasoning replay policy.
+
+    Non-reasoning fields are preserved verbatim so proxy passthrough keeps
+    client-authored extensions, multimodal blocks, names, and other metadata.
+    """
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
+    filtered = [deepcopy(msg) for msg in messages]
+    if reasoning_replay == "full":
+        return filtered
+
+    last_reasoning_index: int | None = None
+    if reasoning_replay == "keep-last":
+        for i, msg in enumerate(filtered):
+            if msg.get("role") == "assistant" and any(
+                msg.get(field) for field in REASONING_MESSAGE_FIELDS
+            ):
+                last_reasoning_index = i
+
+    for i, msg in enumerate(filtered):
+        if msg.get("role") != "assistant":
+            continue
+        if reasoning_replay == "none" or i != last_reasoning_index:
+            for field in REASONING_MESSAGE_FIELDS:
+                msg.pop(field, None)
+    return filtered
diff --git a/src/forge/core/runner.py b/src/forge/core/runner.py
index 79c2c0e..a880595 100644
--- a/src/forge/core/runner.py
+++ b/src/forge/core/runner.py
@@ -11,6 +11,7 @@
 from forge.context.manager import ContextManager
 from forge.core.inference import _NUDGE_KIND_TO_TYPE, _build_tool_call_infos, run_inference
 from forge.core.messages import Message, MessageMeta, MessageRole, MessageType, ToolCallInfo
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, ReasoningReplay, validate_reasoning_replay
 from forge.core.workflow import ToolCall, TextResponse, Workflow, ToolSpec
 from forge.errors import MaxIterationsError, PrerequisiteError, StepEnforcementError, ToolCallError, ToolExecutionError, ToolResolutionError, WorkflowCancelledError
 from forge.guardrails import ErrorTracker, ResponseValidator, StepEnforcer
@@ -42,6 +43,7 @@ def __init__(
         on_message: Callable[[Message], None] | None = None,
         rescue_enabled: bool = True,
         retry_nudge: Callable[[str], str] | str | None = None,
+        reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
     ):
         """
         Args:
@@ -65,6 +67,8 @@ def __init__(
             retry_nudge: Custom nudge for bare text responses. Pass a string
                 for a static message, or a callable ``(raw_response) -> str``
                 for dynamic nudges. If None, uses the default.
+            reasoning_replay: How much captured reasoning to replay to the
+                backend on later turns: ``full``, ``keep-last``, or ``none``.
         """
         self.client = client
         self.context_manager = context_manager
@@ -75,6 +79,7 @@ def __init__(
         self.on_chunk = on_chunk
         self.on_message = on_message
         self.rescue_enabled = rescue_enabled
+        self.reasoning_replay = validate_reasoning_replay(reasoning_replay)
         if isinstance(retry_nudge, str):
             self._retry_nudge_fn: Callable[[str], str] | None = lambda _raw, _msg=retry_nudge: _msg
         else:
@@ -180,6 +185,7 @@ def _emit(msg: Message) -> None:
                 max_attempts=self.max_iterations - iteration,
                 stream=self.stream,
                 on_chunk=self.on_chunk,
+                reasoning_replay=self.reasoning_replay,
             )
             # max_attempts exhausted — iteration budget spent
             if result is None:
diff --git a/src/forge/proxy/__main__.py b/src/forge/proxy/__main__.py
index 3b55ac9..d29f61a 100644
--- a/src/forge/proxy/__main__.py
+++ b/src/forge/proxy/__main__.py
@@ -8,6 +8,7 @@
 import sys
 import time
 
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, REASONING_REPLAY_CHOICES
 from forge.proxy.proxy import ProxyServer
 from forge.server import BudgetMode
 
@@ -85,6 +86,13 @@ def main() -> None:
         help="Inject forge's synthetic respond() tool when the client sends "
              "tools (keeps small models in tool-calling mode). Default off.",
     )
+    parser.add_argument(
+        "--reasoning-replay",
+        choices=REASONING_REPLAY_CHOICES,
+        default=DEFAULT_REASONING_REPLAY,
+        help="How much captured reasoning to replay to the backend "
+             "(default: keep-last).",
+    )
     parser.add_argument("--verbose", "-v", action="store_true", help="Verbose logging")
 
     args = parser.parse_args()
@@ -124,6 +132,7 @@ def main() -> None:
         inject_respond_tool=args.inject_respond_tool,
         backend_protocol=args.backend_protocol,
         backend_timeout=args.backend_timeout,
+        reasoning_replay=args.reasoning_replay,
     )
 
     def _shutdown(sig: int, _frame: object) -> None:
diff --git a/src/forge/proxy/convert.py b/src/forge/proxy/convert.py
index 33fbd5f..2af7819 100644
--- a/src/forge/proxy/convert.py
+++ b/src/forge/proxy/convert.py
@@ -7,6 +7,7 @@
 from typing import Any
 
 from forge.core.messages import Message, MessageMeta, MessageRole, MessageType, ToolCallInfo
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, ReasoningReplay, validate_reasoning_replay
 from forge.core.workflow import ToolCall, TextResponse
 
 
@@ -42,6 +43,17 @@ def openai_to_messages(openai_messages: list[dict[str, Any]]) -> list[Message]:
             ))
 
         elif role_str == "assistant":
+            reasoning = (
+                msg.get("reasoning_content")
+                or msg.get("reasoning")
+                or msg.get("reasoning_text")
+            )
+            if reasoning:
+                messages.append(Message(
+                    MessageRole.ASSISTANT,
+                    str(reasoning),
+                    MessageMeta(MessageType.REASONING),
+                ))
             if "tool_calls" in msg and msg["tool_calls"]:
                 tc_infos = []
                 for tc in msg["tool_calls"]:
@@ -61,7 +73,7 @@ def openai_to_messages(openai_messages: list[dict[str, Any]]) -> list[Message]:
                     MessageMeta(MessageType.TOOL_CALL),
                     tool_calls=tc_infos,
                 ))
-            else:
+            elif content:
                 messages.append(Message(
                     MessageRole.ASSISTANT,
                     content,
@@ -96,8 +108,10 @@ def tool_calls_to_openai(
     tool_calls: list[ToolCall],
     model: str = "forge",
     usage: Any | None = None,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> dict[str, Any]:
     """Convert forge ToolCalls to an OpenAI chat completions response object."""
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
     tc_list = []
     for i, tc in enumerate(tool_calls):
         tc_list.append({
@@ -109,17 +123,22 @@ def tool_calls_to_openai(
             },
         })
 
+    reasoning = tool_calls[0].reasoning if tool_calls else None
+    message: dict[str, Any] = {
+        "role": "assistant",
+        "content": (reasoning or None) if reasoning_replay == "full" else None,
+        "tool_calls": tc_list,
+    }
+    if reasoning and reasoning_replay == "keep-last":
+        message["reasoning_content"] = reasoning
+
     response = {
         "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
         "object": "chat.completion",
         "model": model,
         "choices": [{
             "index": 0,
-            "message": {
-                "role": "assistant",
-                "content": tool_calls[0].reasoning or None,
-                "tool_calls": tc_list,
-            },
+            "message": message,
             "finish_reason": "tool_calls",
         }],
         "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
@@ -172,24 +191,31 @@ def tool_calls_to_sse_events(
     tool_calls: list[ToolCall],
     model: str = "forge",
     usage: Any | None = None,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> list[dict[str, Any]]:
     """Convert forge ToolCalls to a sequence of SSE chunk objects.
 
     Returns the complete list of chunk dicts ready to be formatted as
     SSE data lines. The caller handles the actual SSE wire format.
     """
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
     cmpl_id = f"chatcmpl-{uuid.uuid4().hex[:12]}"
     events: list[dict[str, Any]] = []
 
-    # If there's reasoning, send it as a content delta first
-    if tool_calls[0].reasoning:
+    reasoning = tool_calls[0].reasoning if tool_calls else None
+    if reasoning and reasoning_replay != "none":
+        delta: dict[str, Any] = {"role": "assistant"}
+        if reasoning_replay == "full":
+            delta["content"] = reasoning
+        else:
+            delta["reasoning_content"] = reasoning
         events.append({
             "id": cmpl_id,
             "object": "chat.completion.chunk",
             "model": model,
             "choices": [{
                 "index": 0,
-                "delta": {"role": "assistant", "content": tool_calls[0].reasoning},
+                "delta": delta,
                 "finish_reason": None,
             }],
         })
diff --git a/src/forge/proxy/convert_anthropic.py b/src/forge/proxy/convert_anthropic.py
index d7e5d55..df40f9b 100644
--- a/src/forge/proxy/convert_anthropic.py
+++ b/src/forge/proxy/convert_anthropic.py
@@ -11,6 +11,7 @@
 from typing import Any
 
 from forge.core.messages import Message, MessageMeta, MessageRole, MessageType, ToolCallInfo
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, ReasoningReplay, validate_reasoning_replay
 from forge.core.workflow import ToolCall, ToolSpec
 
 
@@ -233,11 +234,13 @@ def tool_calls_to_anthropic(
     tool_calls: list[ToolCall],
     model: str = "forge",
     usage: Any | None = None,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> dict[str, Any]:
     """Convert forge ToolCalls to an Anthropic Messages API response object."""
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
     blocks: list[dict[str, Any]] = []
 
-    if tool_calls and tool_calls[0].reasoning:
+    if tool_calls and tool_calls[0].reasoning and reasoning_replay == "full":
         blocks.append({"type": "text", "text": tool_calls[0].reasoning})
 
     for tc in tool_calls:
@@ -284,6 +287,7 @@ def tool_calls_to_anthropic_sse(
     tool_calls: list[ToolCall],
     model: str = "forge",
     usage: Any | None = None,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> list[dict[str, Any]]:
     """Build the Anthropic SSE event sequence for a tool-use response.
 
@@ -291,6 +295,7 @@ def tool_calls_to_anthropic_sse(
     formatter reads that to emit ``event: <type>`` lines. Spec:
     https://platform.claude.com/docs/en/build-with-claude/streaming
     """
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
     au = _anthropic_usage(usage)
     msg_id = f"msg_{uuid.uuid4().hex[:24]}"
     events: list[dict[str, Any]] = []
@@ -313,7 +318,7 @@ def tool_calls_to_anthropic_sse(
 
     # Reasoning text first, if present.
     reasoning = tool_calls[0].reasoning if tool_calls else None
-    if reasoning:
+    if reasoning and reasoning_replay == "full":
         events.append({
             "type": "content_block_start",
             "index": block_idx,
diff --git a/src/forge/proxy/handler.py b/src/forge/proxy/handler.py
index c19dcd2..3e3712c 100644
--- a/src/forge/proxy/handler.py
+++ b/src/forge/proxy/handler.py
@@ -8,7 +8,12 @@
 
 from forge.clients.base import LLMClient, format_tool
 from forge.context.manager import ContextManager
-from forge.core.inference import _get_usage, fold_and_serialize, run_inference
+from forge.core.inference import _get_usage, prepare_backend_messages, run_inference
+from forge.core.reasoning import (
+    DEFAULT_REASONING_REPLAY,
+    ReasoningReplay,
+    validate_reasoning_replay,
+)
 from forge.core.workflow import ToolCall, ToolSpec, TextResponse
 from forge.errors import ToolCallError
 from forge.guardrails import ErrorTracker, ResponseValidator
@@ -125,6 +130,7 @@ async def handle_chat_completions(
     native_passthrough: bool = True,
     inject_respond_tool: bool = False,
     protocol: Literal["openai", "anthropic"] = "openai",
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> dict[str, Any] | list[dict[str, Any]]:
     """Handle an inbound completions request.
 
@@ -153,11 +159,14 @@ async def handle_chat_completions(
             untouched unless explicitly opted in.
         protocol: Inbound wire format. ``openai`` for
             ``/v1/chat/completions``; ``anthropic`` for ``/v1/messages``.
+        reasoning_replay: How much captured reasoning to replay to the
+            backend and expose to clients.
 
     Returns:
         If stream=false: a single response dict (protocol-shaped).
         If stream=true: a list of SSE event dicts (protocol-shaped).
     """
+    reasoning_replay = validate_reasoning_replay(reasoning_replay)
     is_stream = body.get("stream", False)
     model_name = body.get("model", "forge")
 
@@ -229,7 +238,13 @@ async def handle_chat_completions(
     if not tool_specs:
         logger.info("No tools in request, passing through to backend")
         api_format = getattr(client, "api_format", "ollama")
-        api_messages = raw_messages_for_backend or fold_and_serialize(messages, api_format)
+        api_messages = prepare_backend_messages(
+            messages,
+            api_format,
+            reasoning_replay=reasoning_replay,
+            raw_openai_messages=raw_messages_for_backend,
+            use_raw_messages=raw_messages_for_backend is not None,
+        )
         response = await client.send(
             api_messages, tools=None, sampling=sampling, passthrough=passthrough,
             inbound_anthropic_body=inbound_anthropic_body,
@@ -256,6 +271,7 @@ async def handle_chat_completions(
             inbound_anthropic_body=inbound_anthropic_body,
             raw_openai_messages=raw_messages_for_backend,
             raw_openai_tools=raw_tools_for_backend,
+            reasoning_replay=reasoning_replay,
         )
     except ToolCallError as exc:
         # Retries exhausted — the model kept returning text instead of tool
@@ -289,7 +305,10 @@ async def handle_chat_completions(
     if other_calls:
         # Real tool calls (possibly mixed with respond) — return the
         # real tool calls only, drop respond.
-        return _emit_tool_calls(other_calls, model_name, protocol, is_stream, usage=usage)
+        return _emit_tool_calls(
+            other_calls, model_name, protocol, is_stream, usage=usage,
+            reasoning_replay=reasoning_replay,
+        )
 
     # Shouldn't happen, but handle empty tool_calls gracefully
     return _emit_text("", model_name, protocol, is_stream, usage=usage)
@@ -318,12 +337,22 @@ def _emit_tool_calls(
     protocol: str,
     is_stream: bool,
     usage: Any | None = None,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> dict[str, Any] | list[dict[str, Any]]:
     """Protocol-aware tool-call response emitter."""
     if protocol == "anthropic":
         if is_stream:
-            return tool_calls_to_anthropic_sse(tool_calls, model=model, usage=usage)
-        return tool_calls_to_anthropic(tool_calls, model=model, usage=usage)
+            return tool_calls_to_anthropic_sse(
+                tool_calls, model=model, usage=usage,
+                reasoning_replay=reasoning_replay,
+            )
+        return tool_calls_to_anthropic(
+            tool_calls, model=model, usage=usage, reasoning_replay=reasoning_replay,
+        )
     if is_stream:
-        return tool_calls_to_sse_events(tool_calls, model=model, usage=usage)
-    return tool_calls_to_openai(tool_calls, model=model, usage=usage)
+        return tool_calls_to_sse_events(
+            tool_calls, model=model, usage=usage, reasoning_replay=reasoning_replay,
+        )
+    return tool_calls_to_openai(
+        tool_calls, model=model, usage=usage, reasoning_replay=reasoning_replay,
+    )
diff --git a/src/forge/proxy/proxy.py b/src/forge/proxy/proxy.py
index d77c0b4..a937324 100644
--- a/src/forge/proxy/proxy.py
+++ b/src/forge/proxy/proxy.py
@@ -20,6 +20,7 @@
 from forge.clients.vllm import VLLMClient
 from forge.context.manager import ContextManager
 from forge.context.strategies import TieredCompact
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, ReasoningReplay, validate_reasoning_replay
 from forge.proxy.server import HTTPServer
 from forge.server import BudgetMode, ServerManager, setup_backend
 
@@ -72,6 +73,7 @@ def __init__(
         inject_respond_tool: bool = False,
         backend_protocol: Literal["openai", "anthropic"] = "openai",
         backend_timeout: float = 300.0,
+        reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
     ) -> None:
         """
         Args:
@@ -118,6 +120,8 @@ def __init__(
                 Only meaningful in external mode; ignored in managed mode.
             backend_timeout: Timeout in seconds for requests from the proxy to
                 the downstream backend.
+            reasoning_replay: How much captured reasoning to replay to the
+                backend on later turns: ``full``, ``keep-last``, or ``none``.
         """
         if backend_url is None and backend is None:
             raise ValueError("Provide either backend_url (external) or backend (managed)")
@@ -177,6 +181,7 @@ def __init__(
         self._inject_respond_tool = inject_respond_tool
         self._backend_protocol = backend_protocol
         self._backend_timeout = backend_timeout
+        self._reasoning_replay = validate_reasoning_replay(reasoning_replay)
 
         # Auto-detect serialization: managed (no external url) = single local
         # GPU = serialize. External callers manage their own concurrency.
@@ -262,6 +267,7 @@ async def _async_start(self, ready: threading.Event) -> None:
             rescue_enabled=self._rescue_enabled,
             native_passthrough=self._backend_capability == "native",
             inject_respond_tool=self._inject_respond_tool,
+            reasoning_replay=self._reasoning_replay,
         )
         await self._http_server.start()
         self._started = True
diff --git a/src/forge/proxy/server.py b/src/forge/proxy/server.py
index 1a0328c..1f8e16a 100644
--- a/src/forge/proxy/server.py
+++ b/src/forge/proxy/server.py
@@ -15,6 +15,7 @@
 
 from forge.clients.base import LLMClient
 from forge.context.manager import ContextManager
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, ReasoningReplay, validate_reasoning_replay
 from forge.proxy.handler import handle_chat_completions
 
 logger = logging.getLogger("forge.proxy")
@@ -52,6 +53,7 @@ def __init__(
         rescue_enabled: bool = True,
         native_passthrough: bool = True,
         inject_respond_tool: bool = False,
+        reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
     ) -> None:
         self._client = client
         self._context_manager = context_manager
@@ -62,6 +64,7 @@ def __init__(
         self._rescue_enabled = rescue_enabled
         self._native_passthrough = native_passthrough
         self._inject_respond_tool = inject_respond_tool
+        self._reasoning_replay = validate_reasoning_replay(reasoning_replay)
         self._server: asyncio.Server | None = None
         self._serialize = serialize_requests
         self._queue: asyncio.Queue[_QueueItem] = asyncio.Queue()
@@ -316,6 +319,7 @@ async def _run_handler(
                 native_passthrough=self._native_passthrough,
                 inject_respond_tool=self._inject_respond_tool,
                 protocol=protocol,
+                reasoning_replay=self._reasoning_replay,
             )
         except Exception as exc:
             logger.exception("Handler error")
diff --git a/tests/unit/test_inference_passthrough.py b/tests/unit/test_inference_passthrough.py
index 78aa851..3829dd5 100644
--- a/tests/unit/test_inference_passthrough.py
+++ b/tests/unit/test_inference_passthrough.py
@@ -47,7 +47,22 @@ async def test_raw_used_on_first_attempt_folded_on_retry():
         MessageRole.USER, "folded-form",
         MessageMeta(MessageType.USER_INPUT),
     )]
-    raw_messages = [{"role": "user", "content": "VERBATIM", "name": "u1"}]
+    raw_messages = [
+        {
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "old",
+            "tool_calls": [],
+            "name": "a1",
+        },
+        {
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "latest",
+            "tool_calls": [],
+            "name": "a2",
+        },
+    ]
     raw_tools = [{"type": "function", "function": {"name": "search", "parameters": {}}}]
 
     result = await run_inference(
@@ -59,6 +74,7 @@ async def test_raw_used_on_first_attempt_folded_on_retry():
         tool_specs=[_search_spec()],
         raw_openai_messages=raw_messages,
         raw_openai_tools=raw_tools,
+        reasoning_replay="full",
     )
 
     assert result is not None
@@ -98,3 +114,48 @@ async def test_no_raw_falls_back_to_fold():
     call = client.send.call_args
     assert call.args[0][0]["content"] == "hello"
     assert "raw_openai_tools" not in call.kwargs
+
+
+@pytest.mark.asyncio
+async def test_non_full_reasoning_replay_filters_raw_reasoning_but_keeps_raw_shape():
+    client = _client([ToolCall(tool="search", args={})])
+    messages = [Message(
+        MessageRole.USER, "folded-form",
+        MessageMeta(MessageType.USER_INPUT),
+    )]
+    raw_messages = [
+        {
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "old",
+            "tool_calls": [],
+            "name": "a1",
+        },
+        {
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "latest",
+            "tool_calls": [],
+            "name": "a2",
+        },
+    ]
+    raw_tools = [{"type": "function", "function": {"name": "search", "parameters": {}}}]
+
+    await run_inference(
+        messages=messages,
+        client=client,
+        context_manager=_ctx(),
+        validator=ResponseValidator(["search"], rescue_enabled=True),
+        error_tracker=ErrorTracker(max_retries=1),
+        tool_specs=[_search_spec()],
+        raw_openai_messages=raw_messages,
+        raw_openai_tools=raw_tools,
+        reasoning_replay="keep-last",
+    )
+
+    call = client.send.call_args
+    assert call.args[0][0]["name"] == "a1"
+    assert "reasoning_content" not in call.args[0][0]
+    assert call.args[0][1]["name"] == "a2"
+    assert call.args[0][1]["reasoning_content"] == "latest"
+    assert call.kwargs["raw_openai_tools"] == raw_tools
diff --git a/tests/unit/test_proxy_convert.py b/tests/unit/test_proxy_convert.py
index 5ce0feb..11b5a2e 100644
--- a/tests/unit/test_proxy_convert.py
+++ b/tests/unit/test_proxy_convert.py
@@ -149,12 +149,28 @@ def test_multiple_tool_calls(self):
         ])
         assert len(result["choices"][0]["message"]["tool_calls"]) == 2
 
-    def test_reasoning_in_content(self):
+    def test_reasoning_default_exposed_as_reasoning_content(self):
         result = tool_calls_to_openai([
             ToolCall(tool="search", args={}, reasoning="Let me think..."),
         ])
+        msg = result["choices"][0]["message"]
+        assert msg["content"] is None
+        assert msg["reasoning_content"] == "Let me think..."
+
+    def test_full_reasoning_replay_exposes_reasoning_in_content(self):
+        result = tool_calls_to_openai([
+            ToolCall(tool="search", args={}, reasoning="Let me think..."),
+        ], reasoning_replay="full")
         assert result["choices"][0]["message"]["content"] == "Let me think..."
 
+    def test_none_reasoning_replay_omits_reasoning(self):
+        result = tool_calls_to_openai([
+            ToolCall(tool="search", args={}, reasoning="Let me think..."),
+        ], reasoning_replay="none")
+        msg = result["choices"][0]["message"]
+        assert msg["content"] is None
+        assert "reasoning_content" not in msg
+
     def test_no_reasoning_content_is_none(self):
         result = tool_calls_to_openai([ToolCall(tool="search", args={})])
         assert result["choices"][0]["message"]["content"] is None
@@ -200,14 +216,27 @@ def test_single_tool_call_structure(self):
         assert events[-1]["choices"][0]["finish_reason"] == "tool_calls"
         assert events[-1]["choices"][0]["delta"] == {}
 
-    def test_reasoning_prepended(self):
+    def test_reasoning_prepended_as_reasoning_content_by_default(self):
         events = tool_calls_to_sse_events([
             ToolCall(tool="search", args={}, reasoning="Thinking..."),
         ])
         # reasoning delta + tool call delta + final
         assert len(events) == 3
+        assert events[0]["choices"][0]["delta"]["reasoning_content"] == "Thinking..."
+
+    def test_full_reasoning_replay_streams_content_delta(self):
+        events = tool_calls_to_sse_events([
+            ToolCall(tool="search", args={}, reasoning="Thinking..."),
+        ], reasoning_replay="full")
         assert events[0]["choices"][0]["delta"]["content"] == "Thinking..."
 
+    def test_none_reasoning_replay_omits_stream_reasoning_delta(self):
+        events = tool_calls_to_sse_events([
+            ToolCall(tool="search", args={}, reasoning="Thinking..."),
+        ], reasoning_replay="none")
+        assert len(events) == 2
+        assert "tool_calls" in events[0]["choices"][0]["delta"]
+
     def test_multiple_tool_calls(self):
         events = tool_calls_to_sse_events([
             ToolCall(tool="a", args={}),
@@ -254,3 +283,32 @@ def test_consistent_completion_id(self):
         events = text_to_sse_events("test", chunk_size=1)
         ids = {e["id"] for e in events}
         assert len(ids) == 1
+
+
+class TestOpenaiReasoningFields:
+    def test_reasoning_content_becomes_reasoning_message(self):
+        msgs = openai_to_messages([{
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "Think.",
+            "tool_calls": [{
+                "id": "call_1",
+                "function": {"name": "search", "arguments": "{}"},
+            }],
+        }])
+
+        assert [m.metadata.type for m in msgs] == [
+            MessageType.REASONING, MessageType.TOOL_CALL,
+        ]
+        assert msgs[0].content == "Think."
+        assert msgs[1].content == ""
+
+    def test_reasoning_only_message_does_not_add_blank_text_response(self):
+        msgs = openai_to_messages([{
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "Think.",
+        }])
+
+        assert len(msgs) == 1
+        assert msgs[0].metadata.type == MessageType.REASONING
diff --git a/tests/unit/test_proxy_convert_anthropic.py b/tests/unit/test_proxy_convert_anthropic.py
index d7b9fa6..23fa62a 100644
--- a/tests/unit/test_proxy_convert_anthropic.py
+++ b/tests/unit/test_proxy_convert_anthropic.py
@@ -250,11 +250,18 @@ def test_shape(self):
         assert tu_blocks[0]["input"] == {"city": "Paris"}
         assert tu_blocks[0]["id"].startswith("toolu_")
 
-    def test_reasoning_emitted_as_text_block(self):
+    def test_default_omits_reasoning_text_block(self):
         result = tool_calls_to_anthropic([
             ToolCall(tool="search", args={"q": "x"}, reasoning="Let me search."),
         ])
         text_blocks = [b for b in result["content"] if b["type"] == "text"]
+        assert text_blocks == []
+
+    def test_full_reasoning_replay_emits_reasoning_as_text_block(self):
+        result = tool_calls_to_anthropic([
+            ToolCall(tool="search", args={"q": "x"}, reasoning="Let me search."),
+        ], reasoning_replay="full")
+        text_blocks = [b for b in result["content"] if b["type"] == "text"]
         assert text_blocks and text_blocks[0]["text"] == "Let me search."
 
     def test_multiple_tool_calls(self):
@@ -298,10 +305,17 @@ def test_event_sequence(self):
         delta = next(e for e in events if e["type"] == "message_delta")
         assert delta["delta"]["stop_reason"] == "tool_use"
 
-    def test_reasoning_block_precedes_tool_use(self):
+    def test_default_omits_reasoning_stream_block(self):
         events = tool_calls_to_anthropic_sse([
             ToolCall(tool="search", args={"q": "x"}, reasoning="Hmm."),
         ])
+        starts = [e for e in events if e["type"] == "content_block_start"]
+        assert starts[0]["content_block"]["type"] == "tool_use"
+
+    def test_full_reasoning_replay_streams_text_block_before_tool_use(self):
+        events = tool_calls_to_anthropic_sse([
+            ToolCall(tool="search", args={"q": "x"}, reasoning="Hmm."),
+        ], reasoning_replay="full")
         # First content_block_start should be type=text (reasoning), then tool_use
         starts = [e for e in events if e["type"] == "content_block_start"]
         assert starts[0]["content_block"]["type"] == "text"
diff --git a/tests/unit/test_proxy_handler.py b/tests/unit/test_proxy_handler.py
index 11de5ff..c4689c4 100644
--- a/tests/unit/test_proxy_handler.py
+++ b/tests/unit/test_proxy_handler.py
@@ -493,8 +493,39 @@ async def test_system_top_level_flows_into_messages(self):
 
 
 class TestNativePassthrough:
-    """The proxy forwards the client's OpenAI tools/messages verbatim on the
-    clean first attempt, bypassing the lossy ToolSpec round-trip."""
+    """Native proxy passthrough keeps raw tools by default; raw messages
+    are forwarded with assistant reasoning filtered unless reasoning_replay="full"."""
+
+    @pytest.mark.asyncio
+    async def test_default_reasoning_replay_filters_raw_reasoning_only(self):
+        client = _mock_client([ToolCall(tool="search", args={"q": "x"})])
+        messages = [
+            {
+                "role": "assistant",
+                "content": None,
+                "reasoning_content": "old",
+                "tool_calls": [],
+                "name": "a1",
+                "vendor": {"kept": True},
+            },
+            {
+                "role": "assistant",
+                "content": None,
+                "reasoning_content": "latest",
+                "tool_calls": [],
+                "name": "a2",
+            },
+        ]
+        await handle_chat_completions(
+            _body(messages=messages, tools=[_tool_def("search")]),
+            client, _context_manager(),
+        )
+        sent_messages = client.send.call_args.args[0]
+        assert sent_messages[0]["name"] == "a1"
+        assert sent_messages[0]["vendor"] == {"kept": True}
+        assert "reasoning_content" not in sent_messages[0]
+        assert sent_messages[1]["name"] == "a2"
+        assert sent_messages[1]["reasoning_content"] == "latest"
 
     @pytest.mark.asyncio
     async def test_raw_tools_forwarded_verbatim(self):
@@ -525,7 +556,7 @@ async def test_raw_messages_forwarded_verbatim(self):
         messages = [{"role": "user", "content": "hi", "name": "u1"}]
         await handle_chat_completions(
             _body(messages=messages, tools=[_tool_def("search")]),
-            client, _context_manager(),
+            client, _context_manager(), reasoning_replay="full",
         )
         sent_messages = client.send.call_args.args[0]
         assert sent_messages == messages
diff --git a/tests/unit/test_reasoning_replay.py b/tests/unit/test_reasoning_replay.py
new file mode 100644
index 0000000..96868cc
--- /dev/null
+++ b/tests/unit/test_reasoning_replay.py
@@ -0,0 +1,136 @@
+"""Tests for reasoning replay policy serialization."""
+
+import pytest
+
+from forge.core.inference import fold_and_serialize, prepare_backend_messages
+from forge.core.messages import Message, MessageMeta, MessageRole, MessageType, ToolCallInfo
+from forge.core.reasoning import filter_openai_reasoning_messages, validate_reasoning_replay
+
+
+def _reasoning(text: str) -> Message:
+    return Message(MessageRole.ASSISTANT, text, MessageMeta(MessageType.REASONING))
+
+
+def _tool_call(name: str) -> Message:
+    return Message(
+        MessageRole.ASSISTANT, "", MessageMeta(MessageType.TOOL_CALL),
+        tool_calls=[ToolCallInfo(name=name, args={}, call_id=f"call_{name}")],
+    )
+
+
+def test_full_replays_every_reasoning_block():
+    messages = [
+        _reasoning("first"), _tool_call("a"),
+        _reasoning("second"), _tool_call("b"),
+    ]
+
+    result = fold_and_serialize(messages, "openai", reasoning_replay="full")
+
+    assert [m["content"] for m in result] == ["first", "second"]
+
+
+def test_keep_last_replays_only_latest_reasoning_block():
+    messages = [
+        _reasoning("first"), _tool_call("a"),
+        _reasoning("second"), _tool_call("b"),
+    ]
+
+    result = fold_and_serialize(messages, "openai", reasoning_replay="keep-last")
+
+    assert [m["content"] for m in result] == ["", "second"]
+
+
+def test_none_replays_no_reasoning_blocks():
+    messages = [
+        _reasoning("first"), _tool_call("a"),
+        _reasoning("second"), _tool_call("b"),
+    ]
+
+    result = fold_and_serialize(messages, "openai", reasoning_replay="none")
+
+    assert [m["content"] for m in result] == ["", ""]
+
+
+def test_keep_last_orphan_reasoning_is_preserved_as_orphan():
+    messages = [_reasoning("first"), _tool_call("a"), _reasoning("orphan")]
+
+    result = fold_and_serialize(messages, "openai", reasoning_replay="keep-last")
+
+    assert result[-1] == {"role": "assistant", "content": "orphan"}
+
+
+def test_validate_reasoning_replay_rejects_unknown_policy():
+    with pytest.raises(ValueError, match="reasoning_replay must be one of"):
+        validate_reasoning_replay("latest")
+
+
+def test_filter_openai_reasoning_messages_only_filters_assistant_messages():
+    messages = [
+        {
+            "role": "user",
+            "content": "keep this",
+            "reasoning_content": "user metadata",
+        },
+        {
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "old assistant reasoning",
+            "name": "a1",
+        },
+        {
+            "role": "assistant",
+            "content": None,
+            "reasoning_content": "latest assistant reasoning",
+            "name": "a2",
+        },
+    ]
+
+    result = filter_openai_reasoning_messages(messages, reasoning_replay="keep-last")
+
+    assert result[0]["reasoning_content"] == "user metadata"
+    assert result[1]["name"] == "a1"
+    assert "reasoning_content" not in result[1]
+    assert result[2]["name"] == "a2"
+    assert result[2]["reasoning_content"] == "latest assistant reasoning"
+
+
+def test_filter_openai_reasoning_messages_none_preserves_user_reasoning_fields():
+    messages = [
+        {"role": "user", "content": "keep", "reasoning": "user value"},
+        {"role": "assistant", "content": None, "reasoning": "drop"},
+    ]
+
+    result = filter_openai_reasoning_messages(messages, reasoning_replay="none")
+
+    assert result[0]["reasoning"] == "user value"
+    assert "reasoning" not in result[1]
+
+
+def test_prepare_backend_messages_filters_raw_openai_reasoning():
+    raw_messages = [
+        {"role": "assistant", "content": None, "reasoning_content": "old", "name": "a1"},
+        {"role": "assistant", "content": None, "reasoning_content": "latest", "name": "a2"},
+    ]
+
+    result = prepare_backend_messages(
+        [],
+        "openai",
+        raw_openai_messages=raw_messages,
+        use_raw_messages=True,
+        reasoning_replay="keep-last",
+    )
+
+    assert result[0]["name"] == "a1"
+    assert "reasoning_content" not in result[0]
+    assert result[1]["name"] == "a2"
+    assert result[1]["reasoning_content"] == "latest"
+
+
+def test_prepare_backend_messages_folds_forge_history_without_raw_messages():
+    messages = [_reasoning("first"), _tool_call("a"), _reasoning("second"), _tool_call("b")]
+
+    result = prepare_backend_messages(
+        messages, "openai", reasoning_replay="keep-last",
+    )
+
+    assert [m["content"] for m in result] == ["", "second"]

From 9fd2508318bec222f0e9213c8b70b9062491b99e Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Tue, 2 Jun 2026 11:55:49 -0500
Subject: [PATCH 02/14] eval: thread reasoning_replay through batch_eval +
 eval_runner

Make reasoning_replay a first-class, resumable, recorded eval axis so the
re-sweep can run none-on-all (regression) and keep-last/full on reasoning
models without collisions.

- EvalConfig gains reasoning_replay; run_scenario passes it to WorkflowRunner
  (the backend pipeline already consumes it). run_eval propagates it.
- batch_eval: run-wide --reasoning-replay choice (mirrors --ablation),
  threaded into run_batch, every EvalConfig, and the JSONL row.
- Centralize the 6 inline resume keys into _run_key(); reasoning_replay is
  now part of the key so distinct policies for the same model+scenario are
  independent runs. _count_completed_runs defaults pre-knob rows (no field)
  to keep-last, so old dumps resume cleanly under the default.
- Both CLIs expose --reasoning-replay {full,keep-last,none}; banners print it.
- Add test_batch_eval_resume.py covering key distinctness, row recording,
  and per-policy resume counting incl. the legacy-default fold.
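
A minimal sketch of the resume-key behavior described above, for
illustration only. The names run_key and count_completed and the sample
rows are hypothetical stand-ins, not the code in batch_eval.py, and the
sketch omits the per-ablation filtering that the real counting pass does:

```python
from collections import Counter

# Default policy as stated in this commit; the name mirrors the real constant.
DEFAULT_REASONING_REPLAY = "keep-last"


def run_key(model, backend, mode, ablation, tool_choice, reasoning_replay, scenario):
    """Hypothetical stand-in for _run_key(): one string per resume dimension."""
    return "|".join(
        [model, backend, mode, ablation, tool_choice, reasoning_replay, scenario]
    )


def count_completed(rows):
    """Count rows per key, folding missing fields into their legacy defaults."""
    counts = Counter()
    for row in rows:
        key = run_key(
            row["model"], row["backend"], row["mode"],
            row.get("ablation", "reforged"),
            row.get("tool_choice", "auto"),
            row.get("reasoning_replay", DEFAULT_REASONING_REPLAY),
            row["scenario"],
        )
        counts[key] += 1
    return counts


# Illustrative rows: the first is a pre-knob row with no reasoning_replay field.
rows = [
    {"model": "m", "backend": "b", "mode": "native", "scenario": "s"},
    {"model": "m", "backend": "b", "mode": "native", "scenario": "s",
     "reasoning_replay": "none"},
]
counts = count_completed(rows)
```

Under these assumptions the pre-knob row lands on the keep-last key, the
second row on the none key, and a full sweep finds no completed runs.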

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 tests/eval/batch_eval.py             | 86 ++++++++++++++++++-------
 tests/eval/eval_runner.py            | 14 +++++
 tests/unit/test_batch_eval_resume.py | 93 ++++++++++++++++++++++++++++
 3 files changed, 171 insertions(+), 22 deletions(-)
 create mode 100644 tests/unit/test_batch_eval_resume.py

diff --git a/tests/eval/batch_eval.py b/tests/eval/batch_eval.py
index 4417f17..5e2084d 100644
--- a/tests/eval/batch_eval.py
+++ b/tests/eval/batch_eval.py
@@ -19,6 +19,7 @@
 from pathlib import Path
 from typing import Any
 
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, REASONING_REPLAY_CHOICES, ReasoningReplay
 from forge.server import BudgetMode, ServerManager
 
 from tests.eval.ablation import ABLATION_PRESETS, AblationConfig
@@ -215,15 +216,39 @@ def _config_key(model: str, backend: str, mode: str) -> str:
     return f"{model}|{backend}|{mode}"
 
 
+def _run_key(
+    model: str,
+    backend: str,
+    mode: str,
+    ablation_name: str,
+    tool_choice: str,
+    reasoning_replay: str,
+    scenario: str,
+) -> str:
+    """Canonical per-run resume key.
+
+    Single source of truth for the resume/dedup dimensions so the counting
+    pass and every run-loop lookup stay in lockstep. reasoning_replay is part
+    of the key: distinct policies (none/keep-last/full) on the same
+    model+scenario are independent runs and must not collide.
+    """
+    return (
+        f"{model}|{backend}|{mode}"
+        f"|{ablation_name}|{tool_choice}|{reasoning_replay}|{scenario}"
+    )
+
+
 def _count_completed_runs(
     jsonl_path: Path,
     ablation_name: str = "reforged",
 ) -> dict[str, int]:
-    """Scan JSONL and count completed runs per (model, backend, mode, ablation, tool_choice, scenario).
+    """Scan JSONL and count completed runs per resume key (see ``_run_key``).
 
-    Returns dict mapping "model|backend|mode|ablation|tool_choice|scenario" → count.
-    Records without an ablation field are treated as "reforged".
-    Records without a tool_choice field are treated as "auto".
+    Returns dict mapping the canonical run key → count. Records without an
+    ablation field are treated as "reforged", without tool_choice as "auto",
+    and without reasoning_replay as the default policy (keep-last). Pre-knob
+    dumps therefore resume cleanly under the default and do not count toward
+    runs under any other policy.
     """
     counts: dict[str, int] = {}
     if not jsonl_path.exists():
@@ -241,9 +266,10 @@ def _count_completed_runs(
             if row_ablation != ablation_name:
                 continue
             row_tc = row.get("tool_choice", "auto")
-            key = (
-                f"{row['model']}|{row['backend']}|{row['mode']}"
-                f"|{row_ablation}|{row_tc}|{row['scenario']}"
+            row_rr = row.get("reasoning_replay", DEFAULT_REASONING_REPLAY)
+            key = _run_key(
+                row["model"], row["backend"], row["mode"],
+                row_ablation, row_tc, row_rr, row["scenario"],
             )
             counts[key] = counts.get(key, 0) + 1
     return counts
@@ -256,6 +282,7 @@ def _run_result_to_row(
     run_idx: int,
     budget_tokens: int | None = None,
     ablation_name: str = "reforged",
+    reasoning_replay: str = DEFAULT_REASONING_REPLAY,
 ) -> dict[str, Any]:
     """Convert a RunResult into a flat dict for JSONL output."""
     row: dict[str, Any] = {
@@ -264,6 +291,7 @@ def _run_result_to_row(
         "mode": config.mode,
         "ablation": ablation_name,
         "tool_choice": config.tool_choice or "auto",
+        "reasoning_replay": reasoning_replay,
         "scenario": result.scenario_name,
         "run": run_idx,
         "completeness": result.completeness,
@@ -563,6 +591,7 @@ async def run_batch(
     tags: list[str] | None = None,
     scenario_names: list[str] | None = None,
     ablation: AblationConfig | None = None,
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY,
 ) -> None:
     """Run all configs × scenarios, appending each result to JSONL.
 
@@ -602,9 +631,9 @@ async def run_batch(
             )
             if scenario.name in _COMPACTION_SCENARIOS and skip_compaction:
                 continue
-            key = (
-                f"{config.model}|{config.backend}|{config.mode}"
-                f"|{ablation_name}|{tc_label_pre}|{scenario.name}"
+            key = _run_key(
+                config.model, config.backend, config.mode,
+                ablation_name, tc_label_pre, reasoning_replay, scenario.name,
             )
             existing = completed_counts.get(key, 0)
             total_expected += max(0, runs_per_scenario - existing)
@@ -642,9 +671,9 @@ async def run_batch(
                     if scenario.name in _COMPACTION_SCENARIOS and skip_compaction:
                         print(f"  {scenario.name}: SKIP (compaction N/A)")
                         continue
-                    key = (
-                        f"{config.model}|{config.backend}|{config.mode}"
-                        f"|{ablation_name}|{tc_label}|{scenario.name}"
+                    key = _run_key(
+                        config.model, config.backend, config.mode,
+                        ablation_name, tc_label, reasoning_replay, scenario.name,
                     )
                     existing = completed_counts.get(key, 0)
                     remaining = max(0, runs_per_scenario - existing)
@@ -674,9 +703,9 @@ async def run_batch(
                         total_skipped += 1
                         continue
 
-                    key = (
-                        f"{config.model}|{config.backend}|{config.mode}"
-                        f"|{ablation_name}|{tc_label}|{scenario.name}"
+                    key = _run_key(
+                        config.model, config.backend, config.mode,
+                        ablation_name, tc_label, reasoning_replay, scenario.name,
                     )
                     existing = completed_counts.get(key, 0)
                     remaining = max(0, runs_per_scenario - existing)
@@ -693,6 +722,7 @@ async def run_batch(
                         keep_message_history=True,
                         verbose=verbose,
                         budget_override=scenario_budget,
+                        reasoning_replay=reasoning_replay,
                     )
 
                     eta = _format_eta(total_ran, total_expected, batch_start)
@@ -732,9 +762,9 @@ async def run_batch(
                 )
                 if scenario.name in _COMPACTION_SCENARIOS and skip_compaction:
                     continue
-                key_check = (
-                    f"{config.model}|{config.backend}|{config.mode}"
-                    f"|{ablation_name}|{tc_label}|{scenario.name}"
+                key_check = _run_key(
+                    config.model, config.backend, config.mode,
+                    ablation_name, tc_label, reasoning_replay, scenario.name,
                 )
                 if completed_counts.get(key_check, 0) < runs_per_scenario:
                     has_work = True
@@ -816,9 +846,9 @@ async def run_batch(
                     total_skipped += 1
                     continue
 
-                key = (
-                    f"{config.model}|{config.backend}|{config.mode}"
-                    f"|{ablation_name}|{tc_label}|{scenario.name}"
+                key = _run_key(
+                    config.model, config.backend, config.mode,
+                    ablation_name, tc_label, reasoning_replay, scenario.name,
                 )
                 existing = completed_counts.get(key, 0)
                 remaining = max(0, runs_per_scenario - existing)
@@ -843,6 +873,7 @@ async def run_batch(
                     verbose=verbose,
                     budget_override=scenario_budget,
                     strategy_overrides={"compaction": TieredCompact(keep_recent=2)},
+                    reasoning_replay=reasoning_replay,
                 )
 
                 eta = _format_eta(total_ran, total_expected, batch_start)
@@ -900,6 +931,7 @@ async def run_batch(
                         result, config, scenario, run_idx + 1,
                         budget_tokens=scenario_budget,
                         ablation_name=ablation_name,
+                        reasoning_replay=reasoning_replay,
                     )
                     with output_path.open("a") as f:
                         f.write(json.dumps(row) + "\n")
@@ -970,6 +1002,14 @@ async def main() -> None:
         default="reforged",
         help="Ablation preset: selectively disable guardrails (default: reforged = all enabled)",
     )
+    parser.add_argument(
+        "--reasoning-replay",
+        choices=list(REASONING_REPLAY_CHOICES),
+        default=DEFAULT_REASONING_REPLAY,
+        help="How much captured reasoning to replay to the backend each turn: "
+        "full (legacy), keep-last (default), none. Part of the resume key, so "
+        "distinct policies for the same model/scenario are independent runs.",
+    )
     parser.add_argument(
         "--model",
         type=str,
@@ -1009,6 +1049,7 @@ async def main() -> None:
     print(f"  Config set:    {args.config} ({len(configs)} configs)")
     print(f"  Budget mode:   {budget_mode.value}")
     print(f"  Ablation:      {ablation.name}")
+    print(f"  Reasoning replay: {args.reasoning_replay}")
     if args.scenario:
         print(f"  Scenarios:     {', '.join(args.scenario)}")
     elif args.tags:
@@ -1033,6 +1074,7 @@ async def main() -> None:
         tags=args.tags,
         scenario_names=args.scenario,
         ablation=ablation,
+        reasoning_replay=args.reasoning_replay,
     )
 
 
diff --git a/tests/eval/eval_runner.py b/tests/eval/eval_runner.py
index b503594..77707b9 100644
--- a/tests/eval/eval_runner.py
+++ b/tests/eval/eval_runner.py
@@ -13,6 +13,7 @@
 from forge.context.manager import CompactEvent, ContextManager
 from forge.context.strategies import CompactStrategy, NoCompact, SlidingWindowCompact, TieredCompact
 from forge.core.messages import Message, MessageType
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY, REASONING_REPLAY_CHOICES, ReasoningReplay
 from forge.core.runner import WorkflowRunner
 from forge.core.workflow import ToolCall, ToolDef, ToolSpec, Workflow
 from forge.errors import ForgeError, StreamError
@@ -63,6 +64,7 @@ class EvalConfig:
     verbose: bool = False
     budget_override: int | None = None
     stream_retries: int = 2
+    reasoning_replay: ReasoningReplay = DEFAULT_REASONING_REPLAY
 
 
 class CountingClientWrapper:
@@ -279,6 +281,7 @@ def on_message(msg: Message) -> None:
         stream=config.stream,
         on_message=on_message,
         rescue_enabled=rescue_enabled,
+        reasoning_replay=config.reasoning_replay,
     )
 
     start = time.monotonic()
@@ -435,6 +438,7 @@ async def run_eval(
             verbose=config.verbose,
             budget_override=scenario_budget,
             stream_retries=config.stream_retries,
+            reasoning_replay=config.reasoning_replay,
         )
 
         scenario_results: list[RunResult] = []
@@ -548,6 +552,13 @@ async def main() -> None:
         default="reforged",
         help="Ablation preset: selectively disable guardrails (default: reforged = all enabled)",
     )
+    parser.add_argument(
+        "--reasoning-replay",
+        choices=list(REASONING_REPLAY_CHOICES),
+        default=DEFAULT_REASONING_REPLAY,
+        help="How much captured reasoning to replay to the backend each turn: "
+        "full (legacy: replay all), keep-last (default: only most recent), none (drop all).",
+    )
     parser.add_argument(
         "--tool-choice",
         choices=["auto", "any"],
@@ -654,6 +665,7 @@ async def main() -> None:
             budget_override=resolved_budget,
             compact_strategy=cli_strategy,
             strategy_overrides={},
+            reasoning_replay=args.reasoning_replay,
         )
     else:
         config = EvalConfig(
@@ -665,6 +677,7 @@ async def main() -> None:
             strategy_overrides={
                 "compaction": TieredCompact(keep_recent=2),
             },
+            reasoning_replay=args.reasoning_replay,
         )
 
     ablation = ABLATION_PRESETS[args.ablation]
@@ -677,6 +690,7 @@ async def main() -> None:
     print(f"Resolved budget: {resolved_budget} tokens")
     print(f"Compact strategy: {strategy_label}")
     print(f"Ablation: {ablation.name}")
+    print(f"Reasoning replay: {args.reasoning_replay}")
     print(f"Tags filter: {args.tags or 'all'}")
     print(f"Scenario filter: {args.scenario or 'all'}")
     print()
diff --git a/tests/unit/test_batch_eval_resume.py b/tests/unit/test_batch_eval_resume.py
new file mode 100644
index 0000000..8ce3f73
--- /dev/null
+++ b/tests/unit/test_batch_eval_resume.py
@@ -0,0 +1,93 @@
+"""Resume-key behavior for the reasoning_replay eval axis (batch_eval).
+
+reasoning_replay is part of the canonical run key: distinct policies
+(none / keep-last / full) on the same model+scenario are independent runs
+and must not collide in resume counting, or a multi-policy sweep would
+over-count completed runs and skip work it never actually ran.
+"""
+
+from __future__ import annotations
+
+import json
+
+from forge.core.reasoning import DEFAULT_REASONING_REPLAY
+
+from tests.eval.batch_eval import (
+    BatchConfig,
+    _count_completed_runs,
+    _run_key,
+    _run_result_to_row,
+)
+from tests.eval.eval_runner import RunResult
+from tests.eval.scenarios import basic_2step
+
+
+def _row(model: str, scenario: str, reasoning_replay: str) -> dict:
+    """Build a JSONL row via the production path for a given policy."""
+    cfg = BatchConfig(model=model, backend="llamaserver", mode="native", think=None)
+    res = RunResult(
+        scenario_name=scenario,
+        completeness=True,
+        iterations_used=3,
+        accuracy=True,
+        messages=None,
+    )
+    return _run_result_to_row(
+        res, cfg, basic_2step, run_idx=1,
+        ablation_name="reforged", reasoning_replay=reasoning_replay,
+    )
+
+
+def test_run_key_distinguishes_reasoning_replay() -> None:
+    base = dict(
+        model="m", backend="llamaserver", mode="native",
+        ablation_name="reforged", tool_choice="auto", scenario="s",
+    )
+    k_none = _run_key(reasoning_replay="none", **base)
+    k_keep = _run_key(reasoning_replay="keep-last", **base)
+    k_full = _run_key(reasoning_replay="full", **base)
+
+    # All three policies yield distinct keys...
+    assert len({k_none, k_keep, k_full}) == 3
+    # ...and the key is stable for the same inputs.
+    assert _run_key(reasoning_replay="none", **base) == k_none
+    assert "none" in k_none
+
+
+def test_run_result_to_row_records_reasoning_replay() -> None:
+    row = _row("M", "sc", "none")
+    assert row["reasoning_replay"] == "none"
+
+    # Default when the caller doesn't pass one (legacy callers / inert axis).
+    cfg = BatchConfig(model="M", backend="llamaserver", mode="native", think=None)
+    res = RunResult(scenario_name="sc", completeness=True, iterations_used=2, messages=None)
+    default_row = _run_result_to_row(res, cfg, basic_2step, run_idx=1)
+    assert default_row["reasoning_replay"] == DEFAULT_REASONING_REPLAY
+
+
+def test_count_completed_runs_separates_policies(tmp_path) -> None:
+    rows = [
+        _row("M", "sc", "none"),
+        _row("M", "sc", "none"),
+        _row("M", "sc", "full"),
+        _row("M", "sc", "keep-last"),
+    ]
+    # A pre-knob row (no reasoning_replay field) must fold into the default
+    # policy, so a default-policy resume skips it and a different policy re-runs.
+    legacy = _row("M", "sc", "keep-last")
+    del legacy["reasoning_replay"]
+    rows.append(legacy)
+
+    path = tmp_path / "results.jsonl"
+    path.write_text("\n".join(json.dumps(r) for r in rows) + "\n")
+
+    counts = _count_completed_runs(path, ablation_name="reforged")
+
+    def key(rr: str) -> str:
+        return _run_key("M", "llamaserver", "native", "reforged", "auto", rr, "sc")
+
+    assert counts[key("none")] == 2
+    assert counts[key("full")] == 1
+    # explicit keep-last + the legacy row defaulting to keep-last
+    assert counts[key("keep-last")] == 2
+    assert counts[key("none")] + counts[key("full")] + counts[key("keep-last")] == 5

From 5b77d0964679bf65d236d45e0535e3060dd59876 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Tue, 2 Jun 2026 18:49:22 -0500
Subject: [PATCH 03/14] eval: add on-wire reasoning counter to validate replay
 knob

Add count_wire_reasoning() (eval-only, no src change): serialize the
recorded transcript through the real fold_and_serialize choke point and
count which reasoning blocks survive onto the backend wire. Emit
reasoning_wire (survived) and reasoning_wire_total (non-empty blocks)
per batch_eval row, so the sweep records an actual replay rate.

Validated on a reasoning model (N=10, all 3 policies, 26 scenarios):
none -> 0 on the wire across all 260 rows; keep-last in {0,1};
full in [0, total]. Surfaces that legacy/full replay is itself lossy
(~29% of generated reasoning reaches the wire) due to consecutive-block
collapse and empty-reasoning omission in fold_and_serialize.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 tests/eval/batch_eval.py            | 11 ++++++-
 tests/eval/metrics.py               | 45 ++++++++++++++++++++++++++++-
 tests/unit/test_reasoning_replay.py | 32 ++++++++++++++++++++
 3 files changed, 86 insertions(+), 2 deletions(-)
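
Note (not part of the commit): the counting idea above can be sketched
without forge. This is a minimal illustration, assuming a hypothetical
`toy_serialize` in place of the real `fold_and_serialize`; it does not model
consecutive-block collapse. The sketch JSON-escapes each needle before
searching, because `json.dumps` escapes newlines and quotes in the payload.

```python
import json


def toy_serialize(reasoning_blocks, policy):
    """Hypothetical stand-in for the real serializer: applies a replay policy."""
    if policy == "none":
        kept = []
    elif policy == "keep-last":
        kept = reasoning_blocks[-1:]
    elif policy == "full":
        kept = list(reasoning_blocks)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return [{"role": "assistant", "reasoning_content": text} for text in kept]


def count_survivors(reasoning_blocks, policy):
    """Return (survived, total) over the non-empty reasoning blocks."""
    texts = [t for t in reasoning_blocks if t]
    if not texts:
        return 0, 0
    blob = json.dumps(toy_serialize(texts, policy), ensure_ascii=False)
    # Search for the JSON-escaped form so newlines and quotes still match.
    survived = sum(
        1 for t in texts if json.dumps(t, ensure_ascii=False)[1:-1] in blob
    )
    return survived, len(texts)


blocks = ["plan: call a\nthen b", "", 'saw "ok", call b']
for policy in ("none", "keep-last", "full"):
    print(policy, count_survivors(blocks, policy))
```

The empty block is excluded from the denominator, matching the
"non-empty blocks" definition of reasoning_wire_total.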

diff --git a/tests/eval/batch_eval.py b/tests/eval/batch_eval.py
index 5e2084d..d588b25 100644
--- a/tests/eval/batch_eval.py
+++ b/tests/eval/batch_eval.py
@@ -24,7 +24,7 @@
 
 from tests.eval.ablation import ABLATION_PRESETS, AblationConfig
 from tests.eval.eval_runner import EvalConfig, RunResult, run_scenario
-from tests.eval.metrics import analyze_history, compute_metrics
+from tests.eval.metrics import analyze_history, compute_metrics, count_wire_reasoning
 from tests.eval.scenarios import ALL_SCENARIOS, EvalScenario
 
 # ── GGUF paths ──────────────────────────────────────────────────
@@ -313,11 +313,20 @@ def _run_result_to_row(
         row["step_nudges"] = stats.step_nudges
         row["tool_errors"] = stats.tool_errors
         row["reasoning_msgs"] = stats.reasoning_messages
+        # On-wire reasoning that survives the replay policy (independent
+        # validation of the knob): none->0, keep-last->{0,1}, full->[0,total].
+        # reasoning_wire_total is the denominator (non-empty reasoning blocks),
+        # so reasoning_wire / reasoning_wire_total is the actual replay rate.
+        wire_survived, wire_total = count_wire_reasoning(result.messages, reasoning_replay)
+        row["reasoning_wire"] = wire_survived
+        row["reasoning_wire_total"] = wire_total
     else:
         row["retry_nudges"] = None
         row["step_nudges"] = None
         row["tool_errors"] = None
         row["reasoning_msgs"] = None
+        row["reasoning_wire"] = None
+        row["reasoning_wire_total"] = None
 
     # Correctness
     row["accuracy"] = result.accuracy
diff --git a/tests/eval/metrics.py b/tests/eval/metrics.py
index 65d8616..bd97538 100644
--- a/tests/eval/metrics.py
+++ b/tests/eval/metrics.py
@@ -2,10 +2,13 @@
 
 from __future__ import annotations
 
+import json
 from dataclasses import dataclass, field
 from typing import TYPE_CHECKING
 
-from forge.core.messages import Message, MessageType
+from forge.core.inference import fold_and_serialize
+from forge.core.messages import Message, MessageRole, MessageType
+from forge.core.reasoning import ReasoningReplay
 
 if TYPE_CHECKING:
     from tests.eval.eval_runner import RunResult
@@ -51,6 +54,46 @@ def analyze_history(messages: list[Message]) -> HistoryStats:
     return stats
 
 
+def count_wire_reasoning(
+    messages: list[Message],
+    reasoning_replay: ReasoningReplay,
+    api_format: str = "openai",
+) -> tuple[int, int]:
+    """Count reasoning messages whose text actually survives onto the backend wire.
+
+    This is an *independent* validation of the reasoning-replay knob, not a
+    reimplementation of it: we serialize the recorded transcript through the
+    real production serializer (``fold_and_serialize`` — the single replay-policy
+    choke point) and then check which of the run's REASONING contents are present
+    in the resulting payload. Returns ``(survived, total)``.
+
+    Expected by policy on a transcript with N>0 non-empty reasoning messages:
+      * ``none``      -> survived == 0   (knob strips all reasoning from the wire)
+      * ``keep-last`` -> survived <= 1   (at most the final reasoning is folded)
+      * ``full``      -> survived <= N   (legacy; folding can still drop blocks)
+
+    Semantics note: this re-derives the *final-snapshot* wire payload, so for
+    ``keep-last`` it reflects the last reasoning in the completed history, not the
+    cumulative count sent turn-by-turn. ``none``==0 is exact for every prefix
+    because the drop is unconditional. Reasoning survival is independent of
+    ``api_format`` (the drop happens before ``to_api_dict``), so the default is fine.
+    """
+    reasoning_texts = [
+        m.content
+        for m in messages
+        if m.metadata.type == MessageType.REASONING
+        and m.role == MessageRole.ASSISTANT
+        and m.content
+    ]
+    total = len(reasoning_texts)
+    if total == 0:
+        return 0, 0
+    wire = fold_and_serialize(messages, api_format, reasoning_replay=reasoning_replay)
+    blob = json.dumps(wire, ensure_ascii=False)
+    survived = sum(1 for text in reasoning_texts if json.dumps(text, ensure_ascii=False)[1:-1] in blob)
+    return survived, total
+
+
 # ── Aggregated metrics ───────────────────────────────────────────
 
 
diff --git a/tests/unit/test_reasoning_replay.py b/tests/unit/test_reasoning_replay.py
index 96868cc..c9ec1e5 100644
--- a/tests/unit/test_reasoning_replay.py
+++ b/tests/unit/test_reasoning_replay.py
@@ -6,6 +6,8 @@
 from forge.core.messages import Message, MessageMeta, MessageRole, MessageType, ToolCallInfo
 from forge.core.reasoning import filter_openai_reasoning_messages, validate_reasoning_replay
 
+from tests.eval.metrics import count_wire_reasoning
+
 
 def _reasoning(text: str) -> Message:
     return Message(MessageRole.ASSISTANT, text, MessageMeta(MessageType.REASONING))
@@ -134,3 +136,33 @@ def test_prepare_backend_messages_folds_forge_history_without_raw_messages():
     )
 
     assert [m["content"] for m in result] == ["", "second"]
+
+
+# ── Eval-side on-wire reasoning counter (validates the knob end to end) ──
+
+def _wire_transcript() -> list[Message]:
+    return [
+        _reasoning("first"), _tool_call("a"),
+        _reasoning("second"), _tool_call("b"),
+    ]
+
+
+def test_count_wire_reasoning_full_keeps_all():
+    survived, total = count_wire_reasoning(_wire_transcript(), "full")
+    assert (survived, total) == (2, 2)
+
+
+def test_count_wire_reasoning_keep_last_keeps_one():
+    survived, total = count_wire_reasoning(_wire_transcript(), "keep-last")
+    assert (survived, total) == (1, 2)
+
+
+def test_count_wire_reasoning_none_strips_all():
+    # The core claim: none puts zero reasoning on the wire.
+    survived, total = count_wire_reasoning(_wire_transcript(), "none")
+    assert (survived, total) == (0, 2)
+
+
+def test_count_wire_reasoning_no_reasoning_is_zero_zero():
+    survived, total = count_wire_reasoning([_tool_call("a"), _tool_call("b")], "full")
+    assert (survived, total) == (0, 0)

From 6f6b3460bbfd3244780080d92a044c3573773f82 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 4 Jun 2026 21:58:01 -0500
Subject: [PATCH 04/14] feat(eval): add Anthropic prompt caching + cache-aware
 cost

Cache the re-sent tool defs + system prompt on the Anthropic eval path so
the repeated input prefix bills at 0.1x (read) instead of full price every
turn. Billing-only: identical model behavior, accuracy, and iteration counts
(safe for cross-run comparability).

- AnthropicClient gains opt-in `prompt_caching` (default off, so the proxy
  verbatim path and existing request shape are untouched). When on, a static
  ephemeral breakpoint marks the tools + system prefix in the rebuild path.
- Static-only on purpose: a rolling conversation breakpoint is NOT placed.
  The default reasoning_replay="keep-last" re-serializes earlier tool-call
  messages each turn, which busts a rolling prefix cache (1.25x writes, no
  reads). The conversation prefix is only stable under none/full, and
  reasoning_replay is a measured variable we won't pin, so caching is confined
  to the always-stable tools+system region.
- TokenUsage carries cache_creation/cache_read counts (additive, defaults 0);
  captured in send() and send_stream(); accumulated through CountingClientWrapper
  and RunResult into the JSONL row.
- _compute_cost is cache-aware (write 1.25x, read 0.1x of input rate); applied
  at the row and both eval_runner cost summaries.
- Enabled by default for batch_eval sweeps; eval_runner gains --no-anthropic-cache
  for a cache-free cost-floor comparison.
- Bump claude-opus-4-6 -> claude-opus-4-8 (configs + pricing, $5/$25 verified).

Validated: 1148 unit tests pass (incl. new cache tests) + a live one-run smoke
on compaction_chain_baseline (20,523 cache reads, behavior unchanged).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
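Note (not part of the commit): a toy model of why a rolling conversation
breakpoint would miss under keep-last. The wire format here is hypothetical
and much simpler than forge's real serialization; it only shows that an
earlier message changes shape once it stops being the latest one, so the
previous turn's payload is no longer a prefix of the next.

```python
def serialize_turn(history, policy):
    """Toy wire form: one dict per assistant step, reasoning kept per policy."""
    last = len(history) - 1
    wire = []
    for i, (reasoning, tool) in enumerate(history):
        keep = policy == "full" or (policy == "keep-last" and i == last)
        msg = {"role": "assistant", "tool_call": tool}
        if keep:
            msg["reasoning_content"] = reasoning
        wire.append(msg)
    return wire


def prefix_is_stable(steps, policy):
    """True if each turn's payload extends the previous payload unchanged."""
    for n in range(1, len(steps)):
        prev = serialize_turn(steps[:n], policy)
        curr = serialize_turn(steps[: n + 1], policy)
        if curr[: len(prev)] != prev:
            return False
    return True


steps = [("think 1", "a"), ("think 2", "b"), ("think 3", "c")]
results = {p: prefix_is_stable(steps, p) for p in ("none", "keep-last", "full")}
print(results)
```

In this model only none and full keep the prefix stable, which is the
reason the breakpoint is confined to the tools + system region.
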
 src/forge/clients/anthropic.py       | 52 ++++++++++++++++++
 src/forge/clients/base.py            |  9 ++++
 tests/eval/batch_eval.py             | 54 ++++++++++++++++---
 tests/eval/eval_runner.py            | 47 ++++++++++++++--
 tests/unit/test_anthropic_client.py  | 81 +++++++++++++++++++++++++++-
 tests/unit/test_batch_eval_resume.py | 50 +++++++++++++++++
 tests/unit/test_proxy_path1.py       |  2 +
 7 files changed, 282 insertions(+), 13 deletions(-)
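
Note (not part of the commit): a worked example of the cache-aware cost
formula, using the Opus rates ($5 input / $25 output per Mtok) and the
1.25x write / 0.1x read multipliers from this patch. The token counts and
the `cost_usd` helper name are made up for illustration.

```python
# Rates are USD per million tokens; multipliers apply to the input rate.
INPUT_RATE, OUTPUT_RATE = 5.0, 25.0
CACHE_WRITE_MULTIPLIER, CACHE_READ_MULTIPLIER = 1.25, 0.1


def cost_usd(uncached_in, out, cache_write=0, cache_read=0):
    """Price uncached input, cache writes, cache reads and output separately."""
    return (
        uncached_in * INPUT_RATE
        + cache_write * INPUT_RATE * CACHE_WRITE_MULTIPLIER
        + cache_read * INPUT_RATE * CACHE_READ_MULTIPLIER
        + out * OUTPUT_RATE
    ) / 1_000_000


# Hypothetical run: 1,000 uncached input, 500 output, 2,000 written, 20,000 read.
cached = cost_usd(1_000, 500, cache_write=2_000, cache_read=20_000)
# Same traffic with caching off: every input token bills at the full rate.
uncached = cost_usd(1_000 + 2_000 + 20_000, 500)
print(round(cached, 6), round(uncached, 6))
```

With these made-up counts the cached run costs about $0.04 against about
$0.1275 uncached; the output term is identical in both.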

diff --git a/src/forge/clients/anthropic.py b/src/forge/clients/anthropic.py
index 6437eb7..ddb1349 100644
--- a/src/forge/clients/anthropic.py
+++ b/src/forge/clients/anthropic.py
@@ -40,10 +40,17 @@ def __init__(
         tool_choice: str | None = None,
         recommended_sampling: bool = False,
         base_url: str | None = None,
+        prompt_caching: bool = False,
     ) -> None:
         self.model = model
         self.max_tokens = max_tokens
         self._tool_choice = tool_choice  # "auto", "any", or None (default=auto)
+        # Opt-in Anthropic prompt caching (billing-only). When on, the rebuild
+        # path marks a static cache breakpoint over the tool defs + system
+        # prompt (re-sent verbatim every turn). Off by default so the proxy
+        # verbatim path and existing request shape are untouched. See
+        # _apply_static_cache for why caching is static-only here.
+        self._prompt_caching = prompt_caching
         # Accepted for API symmetry across clients but currently a no-op:
         # AnthropicClient does not expose sampling kwargs through forge today.
         # The Anthropic SDK manages sampling internally.
@@ -279,8 +286,41 @@ def _build_kwargs(
             kwargs["tools"] = self._convert_tools(tools)
             if self._tool_choice and "tool_choice" not in kwargs:
                 kwargs["tool_choice"] = {"type": self._tool_choice}
+        if self._prompt_caching:
+            self._apply_static_cache(kwargs)
         return kwargs
 
+    @staticmethod
+    def _apply_static_cache(kwargs: dict[str, Any]) -> None:
+        """Mark a static ephemeral cache breakpoint over tool defs + system.
+
+        The tool block and system prompt are byte-identical on every turn of a
+        run, so this prefix reliably read-hits (at 0.1×) from turn 2 onward
+        instead of re-billing the re-sent schema + prompt at full price.
+
+        Static-only on purpose: a *rolling* per-turn breakpoint over the growing
+        conversation is NOT placed here. The eval's default
+        ``reasoning_replay="keep-last"`` re-serializes earlier tool-call messages
+        differently each turn (it keeps only the latest reasoning), which busts a
+        rolling prefix cache — you'd pay 1.25× writes with no reads. The
+        conversation prefix is only stable under ``none``/``full``, and
+        ``reasoning_replay`` is a measured variable we won't pin, so caching is
+        confined to the always-stable tools+system region.
+
+        The cached prefix is ordered tools → system → messages, so a single
+        breakpoint on the system block subsumes the tools; we additionally mark
+        the last tool so the tool prefix still caches when ``system`` is absent.
+        """
+        ephemeral = {"type": "ephemeral"}
+        tools = kwargs.get("tools")
+        if tools:
+            tools[-1]["cache_control"] = ephemeral
+        system = kwargs.get("system")
+        if isinstance(system, str) and system:
+            kwargs["system"] = [
+                {"type": "text", "text": system, "cache_control": ephemeral}
+            ]
+
     async def send(
         self,
         messages: list[dict[str, str]],
@@ -319,6 +359,12 @@ async def send(
                 prompt_tokens=response.usage.input_tokens,
                 completion_tokens=response.usage.output_tokens,
                 total_tokens=response.usage.input_tokens + response.usage.output_tokens,
+                cache_creation_input_tokens=getattr(
+                    response.usage, "cache_creation_input_tokens", 0
+                ) or 0,
+                cache_read_input_tokens=getattr(
+                    response.usage, "cache_read_input_tokens", 0
+                ) or 0,
             )
         }
         return self._parse_response(response)
@@ -402,6 +448,12 @@ async def send_stream(
                         completion_tokens=final_message.usage.output_tokens,
                         total_tokens=final_message.usage.input_tokens
                         + final_message.usage.output_tokens,
+                        cache_creation_input_tokens=getattr(
+                            final_message.usage, "cache_creation_input_tokens", 0
+                        ) or 0,
+                        cache_read_input_tokens=getattr(
+                            final_message.usage, "cache_read_input_tokens", 0
+                        ) or 0,
                     )
                 }
         except anthropic.APIError as exc:
diff --git a/src/forge/clients/base.py b/src/forge/clients/base.py
index 2a084a3..7aaa3d2 100644
--- a/src/forge/clients/base.py
+++ b/src/forge/clients/base.py
@@ -25,11 +25,20 @@ class TokenUsage:
     llama-server).  Backends that don't report usage leave the client's
     ``last_usage`` empty and the context manager falls back to heuristic
     estimation.
+
+    ``cache_creation_input_tokens`` / ``cache_read_input_tokens`` are
+    Anthropic prompt-cache counters (0 for backends without caching, or when
+    caching is off). ``prompt_tokens`` stays the *uncached* input sliver and
+    ``total_tokens`` stays ``prompt + completion`` — the cache counters are
+    carried separately so cost can price them (write 1.25×, read 0.1× of the
+    input rate) without shifting any existing consumer's semantics.
     """
 
     prompt_tokens: int
     completion_tokens: int
     total_tokens: int
+    cache_creation_input_tokens: int = 0
+    cache_read_input_tokens: int = 0
 
 
 # Both Ollama and llama-server use the OpenAI tool schema format today.
diff --git a/tests/eval/batch_eval.py b/tests/eval/batch_eval.py
index d588b25..3052bbc 100644
--- a/tests/eval/batch_eval.py
+++ b/tests/eval/batch_eval.py
@@ -156,13 +156,13 @@ class BatchConfig:
 ANTHROPIC_CONFIGS: list[BatchConfig] = [
     BatchConfig(model="claude-haiku-4-5-20251001", backend="anthropic", mode="native", think=None),
     BatchConfig(model="claude-sonnet-4-6", backend="anthropic", mode="native", think=None),
-    BatchConfig(model="claude-opus-4-6", backend="anthropic", mode="native", think=None),
+    BatchConfig(model="claude-opus-4-8", backend="anthropic", mode="native", think=None),
 ]
 
 ANTHROPIC_ANY_CONFIGS: list[BatchConfig] = [
     BatchConfig(model="claude-haiku-4-5-20251001", backend="anthropic", mode="native", think=None, tool_choice="any"),
     BatchConfig(model="claude-sonnet-4-6", backend="anthropic", mode="native", think=None, tool_choice="any"),
-    BatchConfig(model="claude-opus-4-6", backend="anthropic", mode="native", think=None, tool_choice="any"),
+    BatchConfig(model="claude-opus-4-8", backend="anthropic", mode="native", think=None, tool_choice="any"),
 ]
 
 ALL_CONFIGS: list[BatchConfig] = (
@@ -196,16 +196,40 @@ class BatchConfig:
     "claude-haiku-4-5-20251001": (1.0, 5.0),
     "claude-sonnet-4-6": (3.0, 15.0),
     "claude-opus-4-6": (5.0, 25.0),
+    # Opus 4.8 standard mode: $5 input / $25 output per Mtok (anthropic.com,
+    # confirmed 2026-06). Same as 4-6. (Fast mode is 2× — $10/$50 — not used here.)
+    "claude-opus-4-8": (5.0, 25.0),
 }
 
+# Prompt-cache token multipliers on the input rate, uniform across current
+# Anthropic models: writes bill 1.25×, reads bill 0.1×.
+_CACHE_WRITE_MULTIPLIER = 1.25
+_CACHE_READ_MULTIPLIER = 0.1
 
-def _compute_cost(model: str, input_tokens: int, output_tokens: int) -> float:
-    """Compute USD cost from token counts. Returns 0.0 for unknown models."""
+
+def _compute_cost(
+    model: str,
+    input_tokens: int,
+    output_tokens: int,
+    cache_creation_tokens: int = 0,
+    cache_read_tokens: int = 0,
+) -> float:
+    """Compute USD cost from token counts. Returns 0.0 for unknown models.
+
+    ``input_tokens`` is the *uncached* input sliver; cached writes/reads are
+    priced separately off the input rate so prompt caching is reflected
+    accurately (the API reports these as distinct usage fields).
+    """
     rates = _ANTHROPIC_PRICING.get(model)
     if not rates:
         return 0.0
     input_rate, output_rate = rates
-    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
+    return (
+        input_tokens * input_rate
+        + cache_creation_tokens * input_rate * _CACHE_WRITE_MULTIPLIER
+        + cache_read_tokens * input_rate * _CACHE_READ_MULTIPLIER
+        + output_tokens * output_rate
+    ) / 1_000_000
 
 
 # ── JSONL helpers ───────────────────────────────────────────────
@@ -342,11 +366,19 @@ def _run_result_to_row(
         row["wasted_calls"] = None
 
     # Token usage and cost (Anthropic only — local backends report 0)
-    if result.input_tokens or result.output_tokens:
+    if (
+        result.input_tokens or result.output_tokens
+        or result.cache_creation_tokens or result.cache_read_tokens
+    ):
         row["input_tokens"] = result.input_tokens
         row["output_tokens"] = result.output_tokens
+        row["cache_creation_input_tokens"] = result.cache_creation_tokens
+        row["cache_read_input_tokens"] = result.cache_read_tokens
         row["cost_usd"] = round(
-            _compute_cost(config.model, result.input_tokens, result.output_tokens),
+            _compute_cost(
+                config.model, result.input_tokens, result.output_tokens,
+                result.cache_creation_tokens, result.cache_read_tokens,
+            ),
             6,
         )
 
@@ -562,7 +594,13 @@ def _build_client(config: BatchConfig, models_dir: Path) -> Any:
     elif config.backend == "anthropic":
         from forge.clients.anthropic import AnthropicClient
 
-        return AnthropicClient(model=config.model, tool_choice=config.tool_choice)
+        # Prompt caching on for sweeps: billing-only (identical model behavior
+        # and accuracy/iterations metrics), caches the re-sent tool defs +
+        # system prompt. Static-only — see AnthropicClient._apply_static_cache.
+        return AnthropicClient(
+            model=config.model, tool_choice=config.tool_choice,
+            prompt_caching=True,
+        )
 
     else:
         raise ValueError(f"Unknown backend: {config.backend}")
diff --git a/tests/eval/eval_runner.py b/tests/eval/eval_runner.py
index 77707b9..505628a 100644
--- a/tests/eval/eval_runner.py
+++ b/tests/eval/eval_runner.py
@@ -49,6 +49,8 @@ class RunResult:
     stream_retries: int = 0
     input_tokens: int = 0
     output_tokens: int = 0
+    cache_creation_tokens: int = 0
+    cache_read_tokens: int = 0
     cost_usd: float = 0.0
 
 
@@ -75,6 +77,8 @@ def __init__(self, client: LLMClient) -> None:
         self.call_count = 0
         self.total_input_tokens = 0
         self.total_output_tokens = 0
+        self.total_cache_creation_tokens = 0
+        self.total_cache_read_tokens = 0
 
     def __getattr__(self, name: str) -> Any:
         return getattr(self._client, name)
@@ -88,6 +92,13 @@ def _collect_usage(self) -> None:
             for tu in usage.values():
                 self.total_input_tokens += tu.prompt_tokens
                 self.total_output_tokens += tu.completion_tokens
+                # Anthropic prompt-cache counters; 0 for other backends.
+                self.total_cache_creation_tokens += getattr(
+                    tu, "cache_creation_input_tokens", 0
+                )
+                self.total_cache_read_tokens += getattr(
+                    tu, "cache_read_input_tokens", 0
+                )
 
     async def send(
         self,
@@ -332,6 +343,8 @@ def on_message(msg: Message) -> None:
                 stream_retries=attempt,
                 input_tokens=counting_client.total_input_tokens,
                 output_tokens=counting_client.total_output_tokens,
+                cache_creation_tokens=counting_client.total_cache_creation_tokens,
+                cache_read_tokens=counting_client.total_cache_read_tokens,
             )
         except StreamError as exc:
             last_stream_error = exc
@@ -350,6 +363,8 @@ def on_message(msg: Message) -> None:
                 stream_retries=attempt,
                 input_tokens=counting_client.total_input_tokens,
                 output_tokens=counting_client.total_output_tokens,
+                cache_creation_tokens=counting_client.total_cache_creation_tokens,
+                cache_read_tokens=counting_client.total_cache_read_tokens,
             )
         except Exception as exc:
             elapsed = time.monotonic() - start
@@ -365,6 +380,8 @@ def on_message(msg: Message) -> None:
                 stream_retries=attempt,
                 input_tokens=counting_client.total_input_tokens,
                 output_tokens=counting_client.total_output_tokens,
+                cache_creation_tokens=counting_client.total_cache_creation_tokens,
+                cache_read_tokens=counting_client.total_cache_read_tokens,
             )
 
     # All stream retries exhausted
@@ -457,13 +474,15 @@ async def run_eval(
             else:
                 status = "OK"
             cost_str = ""
-            if result.input_tokens:
+            if result.input_tokens or result.cache_read_tokens or result.cache_creation_tokens:
                 from tests.eval.batch_eval import _compute_cost
 
                 cost = _compute_cost(
                     client.model if hasattr(client, "model") else "",
                     result.input_tokens,
                     result.output_tokens,
+                    result.cache_creation_tokens,
+                    result.cache_read_tokens,
                 )
                 if cost > 0:
                     cost_str = f", ${cost:.4f}"
@@ -570,6 +589,13 @@ async def main() -> None:
         action="store_true",
         help="Disable llama-server prompt caching (default: enabled)",
     )
+    parser.add_argument(
+        "--no-anthropic-cache",
+        action="store_true",
+        help="Disable Anthropic prompt caching for --backend anthropic "
+        "(default: enabled). Caching is billing-only; use this for a "
+        "cache-free cost-floor comparison.",
+    )
     parser.add_argument(
         "--compact-strategy",
         choices=["tiered", "sliding", "none"],
@@ -616,7 +642,10 @@ async def main() -> None:
     elif args.backend == "anthropic":
         from forge.clients.anthropic import AnthropicClient
 
-        client = AnthropicClient(model=args.model, tool_choice=args.tool_choice)
+        client = AnthropicClient(
+            model=args.model, tool_choice=args.tool_choice,
+            prompt_caching=not args.no_anthropic_cache,
+        )
     else:
         from forge.clients.llamafile import LlamafileClient
 
@@ -710,14 +739,24 @@ async def main() -> None:
     all_runs = [r for runs in results.values() for r in runs]
     total_input = sum(r.input_tokens for r in all_runs)
     total_output = sum(r.output_tokens for r in all_runs)
-    if total_input:
+    total_cache_creation = sum(r.cache_creation_tokens for r in all_runs)
+    total_cache_read = sum(r.cache_read_tokens for r in all_runs)
+    if total_input or total_cache_read or total_cache_creation:
         from tests.eval.batch_eval import _compute_cost
 
-        total_cost = _compute_cost(args.model, total_input, total_output)
+        total_cost = _compute_cost(
+            args.model, total_input, total_output,
+            total_cache_creation, total_cache_read,
+        )
         print(
             f"Token usage: {total_input:,} input + {total_output:,} output"
             f" = {total_input + total_output:,} total"
         )
+        if total_cache_creation or total_cache_read:
+            print(
+                f"Prompt cache: {total_cache_creation:,} written + "
+                f"{total_cache_read:,} read"
+            )
         if total_cost > 0:
             n_runs = len(all_runs)
             print(f"Total cost: ${total_cost:.4f} ({n_runs} runs, ${total_cost / n_runs:.4f}/run)")
diff --git a/tests/unit/test_anthropic_client.py b/tests/unit/test_anthropic_client.py
index d00b514..edb4548 100644
--- a/tests/unit/test_anthropic_client.py
+++ b/tests/unit/test_anthropic_client.py
@@ -6,7 +6,7 @@
 
 import json
 from typing import Literal
-from unittest.mock import MagicMock
+from unittest.mock import AsyncMock, MagicMock
 
 import pytest
 from pydantic import BaseModel, Field
@@ -428,6 +428,10 @@ async def test_send_records_slot_keyed_usage(self) -> None:
         response.content = [text_block]
         response.usage.input_tokens = 12
         response.usage.output_tokens = 7
+        # Real Anthropic Usage reports these as ints (0 without caching); set
+        # them so the MagicMock doesn't auto-create truthy attrs.
+        response.usage.cache_creation_input_tokens = 0
+        response.usage.cache_read_input_tokens = 0
 
         async def fake_create(**kwargs):
             return response
@@ -441,3 +445,78 @@ async def fake_create(**kwargs):
         assert client.last_usage == {0: expected}
         # Cross-client contract: _get_usage resolves slot 0 to the TokenUsage.
         assert _get_usage(client) == expected
+
+
+# ── Prompt caching (static tools+system breakpoint) ──────────────
+
+
+class TestPromptCaching:
+    """Opt-in prompt caching marks a static breakpoint over tool defs + system
+    in the rebuild path only; off by default; never touches the verbatim path."""
+
+    _MESSAGES = [
+        {"role": "system", "content": "stable system prompt"},
+        {"role": "user", "content": "hi"},
+    ]
+
+    def test_static_cache_marks_tools_and_system(self) -> None:
+        client = AnthropicClient(
+            model="claude-test", api_key="dummy", prompt_caching=True
+        )
+        tools = [_make_spec("a"), _make_spec("b")]
+        kwargs = client._build_kwargs(self._MESSAGES, tools)
+
+        # Last tool carries the ephemeral breakpoint (caches the tool prefix).
+        assert kwargs["tools"][-1]["cache_control"] == {"type": "ephemeral"}
+        # System is converted to a cached text block (caches tools+system).
+        assert isinstance(kwargs["system"], list)
+        assert kwargs["system"][0]["text"] == "stable system prompt"
+        assert kwargs["system"][0]["cache_control"] == {"type": "ephemeral"}
+
+    def test_no_cache_control_by_default(self) -> None:
+        client = AnthropicClient(model="claude-test", api_key="dummy")
+        tools = [_make_spec("a"), _make_spec("b")]
+        kwargs = client._build_kwargs(self._MESSAGES, tools)
+
+        assert "cache_control" not in kwargs["tools"][-1]
+        # System stays a plain string when caching is off.
+        assert kwargs["system"] == "stable system prompt"
+
+    def test_cache_does_not_touch_verbatim_inbound(self) -> None:
+        """prompt_caching must not mutate the path-1 verbatim body — that path
+        carries the proxy's own cache_control and bypasses the rebuild."""
+        client = AnthropicClient(
+            model="claude-test", api_key="dummy", prompt_caching=True
+        )
+        inbound = {
+            "max_tokens": 10,
+            "system": "verbatim system",
+            "messages": [{"role": "user", "content": "hi"}],
+        }
+        kwargs = client._build_kwargs([], None, None, inbound)
+
+        # System stays the verbatim string (NOT converted to a cached block).
+        assert kwargs["system"] == "verbatim system"
+
+    @pytest.mark.asyncio
+    async def test_send_records_cache_usage(self) -> None:
+        client = AnthropicClient(
+            model="claude-test", api_key="dummy", prompt_caching=True
+        )
+        text_block = MagicMock()
+        text_block.type = "text"
+        text_block.text = "ok"
+        response = MagicMock()
+        response.content = [text_block]
+        response.usage.input_tokens = 5
+        response.usage.output_tokens = 3
+        response.usage.cache_creation_input_tokens = 100
+        response.usage.cache_read_input_tokens = 200
+        client._client.messages.create = AsyncMock(return_value=response)
+
+        await client.send([{"role": "user", "content": "hi"}])
+
+        tu = client.last_usage[0]
+        assert tu.prompt_tokens == 5
+        assert tu.cache_creation_input_tokens == 100
+        assert tu.cache_read_input_tokens == 200
diff --git a/tests/unit/test_batch_eval_resume.py b/tests/unit/test_batch_eval_resume.py
index 8ce3f73..8becd20 100644
--- a/tests/unit/test_batch_eval_resume.py
+++ b/tests/unit/test_batch_eval_resume.py
@@ -14,6 +14,7 @@
 
 from tests.eval.batch_eval import (
     BatchConfig,
+    _compute_cost,
     _count_completed_runs,
     _run_key,
     _run_result_to_row,
@@ -91,3 +92,52 @@ def key(rr: str) -> str:
     # explicit keep-last + the legacy row defaulting to keep-last
     assert counts[key("keep-last")] == 2
     assert counts[key("none")] + counts[key("full")] + counts[key("keep-last")] == 5
+
+
+def test_compute_cost_prices_cache_tokens() -> None:
+    """Cache writes bill 1.25× and reads 0.1× of the input rate; uncached input
+    and output keep their base rates. (sonnet: $3 input / $15 output per Mtok.)"""
+    cost = _compute_cost(
+        "claude-sonnet-4-6",
+        input_tokens=1_000,
+        output_tokens=500,
+        cache_creation_tokens=2_000,
+        cache_read_tokens=4_000,
+    )
+    expected = (
+        1_000 * 3.0
+        + 2_000 * 3.0 * 1.25
+        + 4_000 * 3.0 * 0.1
+        + 500 * 15.0
+    ) / 1_000_000
+    assert cost == expected
+
+    # Back-compat: omitting cache args matches the old input+output formula.
+    assert _compute_cost("claude-sonnet-4-6", 1_000, 500) == (
+        1_000 * 3.0 + 500 * 15.0
+    ) / 1_000_000
+
+    # Opus 4.8 is priced (placeholder rate), not an unknown-model 0.0.
+    assert _compute_cost("claude-opus-4-8", 1_000, 0) > 0
+
+
+def test_run_result_to_row_emits_cache_tokens() -> None:
+    cfg = BatchConfig(model="claude-sonnet-4-6", backend="anthropic", mode="native", think=None)
+    res = RunResult(
+        scenario_name="sc",
+        completeness=True,
+        iterations_used=3,
+        accuracy=True,
+        messages=None,
+        input_tokens=1_000,
+        output_tokens=500,
+        cache_creation_tokens=2_000,
+        cache_read_tokens=4_000,
+    )
+    row = _run_result_to_row(res, cfg, basic_2step, run_idx=1)
+
+    assert row["cache_creation_input_tokens"] == 2_000
+    assert row["cache_read_input_tokens"] == 4_000
+    assert row["cost_usd"] == round(
+        _compute_cost("claude-sonnet-4-6", 1_000, 500, 2_000, 4_000), 6
+    )
diff --git a/tests/unit/test_proxy_path1.py b/tests/unit/test_proxy_path1.py
index 4cc8862..49ecc59 100644
--- a/tests/unit/test_proxy_path1.py
+++ b/tests/unit/test_proxy_path1.py
@@ -183,6 +183,8 @@ def _stub_anthropic_response():
     msg.content = [MagicMock(type="text", text="ok")]
     msg.usage.input_tokens = 1
     msg.usage.output_tokens = 1
+    msg.usage.cache_creation_input_tokens = 0
+    msg.usage.cache_read_input_tokens = 0
     return msg
 
 

From b1217c7174048a4bf7ab9fc817c67f293bf9ddb4 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Sat, 6 Jun 2026 05:44:17 -0500
Subject: [PATCH 05/14] Add v0.7.5 reasoning-replay eval results (rig-02
 dual-GPU sweep, 67.6k runs)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 eval_results_v0.7.5.jsonl | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 eval_results_v0.7.5.jsonl

diff --git a/eval_results_v0.7.5.jsonl b/eval_results_v0.7.5.jsonl
new file mode 100644
index 0000000..d185976
--- /dev/null
+++ b/eval_results_v0.7.5.jsonl
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:598db5bfeddbdf7c5e74bd59bdff821b462ce3e396f525cc860dd6c3fba51c0c
+size 41620228

From e450b43be05af35a4f5097c4e549b8df8d34eab0 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Sat, 6 Jun 2026 06:02:39 -0500
Subject: [PATCH 06/14] Complete v0.7.5 reasoning-replay eval grid (78 cells,
 101.4k runs)

Adds the remaining single-GPU sweep partition (33.8k runs, 26 config
cells) to the existing dual-GPU results, completing the full
reasoning_replay grid across all 14 models x {none,keep-last,full} x
{bare,reforged} x {native,prompt}. All 78 cells verified complete
(26 scenarios x 50 runs each), zero duplicate run-keys.
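
As a sanity check on the totals above, here is a minimal sketch of the
arithmetic. The constant and function names are illustrative and are not
taken from the eval harness; the dual-GPU figure is derived by subtraction.

```python
# Grid totals implied by the message above; all names here are illustrative.
SCENARIOS_PER_CELL = 26
RUNS_PER_SCENARIO = 50
RUNS_PER_CELL = SCENARIOS_PER_CELL * RUNS_PER_SCENARIO  # 1,300 runs per cell


def total_runs(cells: int) -> int:
    """Number of runs needed to fill `cells` config cells completely."""
    return cells * RUNS_PER_CELL


full_grid = total_runs(78)         # the complete grid
single_gpu = total_runs(26)        # partition added by this commit
dual_gpu = full_grid - single_gpu  # rows already present before it

print(f"{full_grid:,} = {dual_gpu:,} + {single_gpu:,}")
# prints "101,400 = 67,600 + 33,800"
```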

Rows stamped gen=3 (v0.6.0=1, v0.7.0=2) so cross-generation report
dedup keeps this suite over older generations of the same config.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 eval_results_v0.7.5.jsonl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/eval_results_v0.7.5.jsonl b/eval_results_v0.7.5.jsonl
index d185976..fd28663 100644
--- a/eval_results_v0.7.5.jsonl
+++ b/eval_results_v0.7.5.jsonl
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:598db5bfeddbdf7c5e74bd59bdff821b462ce3e396f525cc860dd6c3fba51c0c
-size 41620228
+oid sha256:a677cffdb3f1f018fe2b61bf3fe37ccc99e7193dddb78f603e0ff52d9df7e6da
+size 63568949

From 694110bc3b847789bd2a24f6bc6ca582ca9586a0 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Sun, 7 Jun 2026 19:36:56 -0500
Subject: [PATCH 07/14] Add Anthropic v0.7.5 eval rows

Enable adaptive extended thinking for the Sonnet and Opus baseline
configs through a new AnthropicClient thinking option, which also
suppresses a forced tool_choice; Haiku stays a non-thinking baseline.
Record the selected reasoning_replay policy on Anthropic result rows
instead of the module default, and refresh the eval dataset pointer.

---
 eval_results_v0.7.5.jsonl            |  4 +--
 src/forge/clients/anthropic.py       | 13 +++++++-
 tests/eval/batch_eval.py             | 21 ++++++++++---
 tests/unit/test_anthropic_client.py  | 39 +++++++++++++++++++++++
 tests/unit/test_batch_eval_resume.py | 46 ++++++++++++++++++++++++++++
 5 files changed, 116 insertions(+), 7 deletions(-)

diff --git a/eval_results_v0.7.5.jsonl b/eval_results_v0.7.5.jsonl
index fd28663..9c6a070 100644
--- a/eval_results_v0.7.5.jsonl
+++ b/eval_results_v0.7.5.jsonl
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a677cffdb3f1f018fe2b61bf3fe37ccc99e7193dddb78f603e0ff52d9df7e6da
-size 63568949
+oid sha256:be698e8707360458c0e2310516149eed2e295c186bb10b847424a5f554418ab5
+size 70832005
diff --git a/src/forge/clients/anthropic.py b/src/forge/clients/anthropic.py
index ddb1349..7cea440 100644
--- a/src/forge/clients/anthropic.py
+++ b/src/forge/clients/anthropic.py
@@ -41,6 +41,7 @@ def __init__(
         recommended_sampling: bool = False,
         base_url: str | None = None,
         prompt_caching: bool = False,
+        thinking: dict[str, Any] | None = None,
     ) -> None:
         self.model = model
         self.max_tokens = max_tokens
@@ -51,6 +52,12 @@ def __init__(
         # verbatim path and existing request shape are untouched. See
         # _apply_static_cache for why caching is static-only here.
         self._prompt_caching = prompt_caching
+        # Extended-thinking request config, e.g. {"type": "adaptive"}. When set,
+        # it is merged into every messages.create call and any configured
+        # tool_choice is suppressed (Anthropic rejects forced tool use with
+        # thinking on). None = thinking off; the proxy passthrough path can
+        # still carry its own ``thinking`` via ``passthrough``.
+        self._thinking = thinking
         # Accepted for API symmetry across clients but currently a no-op:
         # AnthropicClient does not expose sampling kwargs through forge today.
         # The Anthropic SDK manages sampling internally.
@@ -284,8 +291,12 @@ def _build_kwargs(
             kwargs["system"] = system
         if tools:
             kwargs["tools"] = self._convert_tools(tools)
-            if self._tool_choice and "tool_choice" not in kwargs:
+            # Extended thinking is incompatible with a forced tool_choice;
+            # Anthropic accepts only "auto" (the default) or "none" with it.
+            if self._tool_choice and not self._thinking and "tool_choice" not in kwargs:
                 kwargs["tool_choice"] = {"type": self._tool_choice}
+        if self._thinking and "thinking" not in kwargs:
+            kwargs["thinking"] = self._thinking
         if self._prompt_caching:
             self._apply_static_cache(kwargs)
         return kwargs
diff --git a/tests/eval/batch_eval.py b/tests/eval/batch_eval.py
index 3052bbc..25ee699 100644
--- a/tests/eval/batch_eval.py
+++ b/tests/eval/batch_eval.py
@@ -154,9 +154,13 @@ class BatchConfig:
 ]
 
 ANTHROPIC_CONFIGS: list[BatchConfig] = [
-    BatchConfig(model="claude-haiku-4-5-20251001", backend="anthropic", mode="native", think=None),
-    BatchConfig(model="claude-sonnet-4-6", backend="anthropic", mode="native", think=None),
-    BatchConfig(model="claude-opus-4-8", backend="anthropic", mode="native", think=None),
+    # think=True -> adaptive extended thinking ("Claude with reasoning" baseline
+    # rows). Haiku has no adaptive support (API rejects it) so it stays a
+    # non-thinking baseline. Wired in _build_client. NOT part of the
+    # reasoning_replay sweep — thinking here is request-only, no replay folding.
+    BatchConfig(model="claude-haiku-4-5-20251001", backend="anthropic", mode="native", think=False),
+    BatchConfig(model="claude-sonnet-4-6", backend="anthropic", mode="native", think=True),
+    BatchConfig(model="claude-opus-4-8", backend="anthropic", mode="native", think=True),
 ]
 
 ANTHROPIC_ANY_CONFIGS: list[BatchConfig] = [
@@ -597,9 +601,17 @@ def _build_client(config: BatchConfig, models_dir: Path) -> Any:
         # Prompt caching on for sweeps: billing-only (identical model behavior
         # and accuracy/iterations metrics), caches the re-sent tool defs +
         # system prompt. Static-only — see AnthropicClient._apply_static_cache.
+        #
+        # Adaptive extended thinking when think=True ("Claude with reasoning"
+        # baselines). Gated off for tool_choice="any" (forced tool choice is
+        # incompatible with thinking) and for models without adaptive support
+        # (Haiku, configured think=False). Request-only: no reasoning_replay
+        # folding — these are baseline rows, not part of the replay sweep.
+        thinking = {"type": "adaptive"} if (config.think and config.tool_choice != "any") else None
         return AnthropicClient(
             model=config.model, tool_choice=config.tool_choice,
-            prompt_caching=True,
+            prompt_caching=True, thinking=thinking,
+            max_tokens=16384 if thinking else 4096,
         )
 
     else:
@@ -794,6 +806,7 @@ async def run_batch(
                             result, config, scenario, run_idx + 1,
                             budget_tokens=scenario_budget,
                             ablation_name=ablation_name,
+                            reasoning_replay=reasoning_replay,
                         )
                         with output_path.open("a") as f:
                             f.write(json.dumps(row) + "\n")
diff --git a/tests/unit/test_anthropic_client.py b/tests/unit/test_anthropic_client.py
index edb4548..bfc580b 100644
--- a/tests/unit/test_anthropic_client.py
+++ b/tests/unit/test_anthropic_client.py
@@ -520,3 +520,42 @@ async def test_send_records_cache_usage(self) -> None:
         assert tu.prompt_tokens == 5
         assert tu.cache_creation_input_tokens == 100
         assert tu.cache_read_input_tokens == 200
+
+
+class TestThinking:
+    """Adaptive extended-thinking request wiring (baseline rows). Request-only:
+    thinking is merged into the rebuild path and forces tool_choice=auto."""
+
+    _MESSAGES = [
+        {"role": "system", "content": "sys"},
+        {"role": "user", "content": "hi"},
+    ]
+
+    def test_thinking_merged_into_kwargs(self) -> None:
+        client = AnthropicClient(
+            model="claude-test", api_key="dummy", thinking={"type": "adaptive"}
+        )
+        kwargs = client._build_kwargs(self._MESSAGES, [_make_spec("a")])
+        assert kwargs["thinking"] == {"type": "adaptive"}
+
+    def test_no_thinking_by_default(self) -> None:
+        client = AnthropicClient(model="claude-test", api_key="dummy")
+        kwargs = client._build_kwargs(self._MESSAGES, [_make_spec("a")])
+        assert "thinking" not in kwargs
+
+    def test_thinking_suppresses_forced_tool_choice(self) -> None:
+        # Anthropic forbids a forced tool_choice with thinking on -> must drop it.
+        client = AnthropicClient(
+            model="claude-test", api_key="dummy",
+            tool_choice="any", thinking={"type": "adaptive"},
+        )
+        kwargs = client._build_kwargs(self._MESSAGES, [_make_spec("a")])
+        assert "tool_choice" not in kwargs
+        assert kwargs["thinking"] == {"type": "adaptive"}
+
+    def test_forced_tool_choice_kept_when_no_thinking(self) -> None:
+        client = AnthropicClient(
+            model="claude-test", api_key="dummy", tool_choice="any"
+        )
+        kwargs = client._build_kwargs(self._MESSAGES, [_make_spec("a")])
+        assert kwargs["tool_choice"] == {"type": "any"}
diff --git a/tests/unit/test_batch_eval_resume.py b/tests/unit/test_batch_eval_resume.py
index 8becd20..937c666 100644
--- a/tests/unit/test_batch_eval_resume.py
+++ b/tests/unit/test_batch_eval_resume.py
@@ -10,14 +10,18 @@
 
 import json
 
+import pytest
+
 from forge.core.reasoning import DEFAULT_REASONING_REPLAY
 
+import tests.eval.batch_eval as batch_eval
 from tests.eval.batch_eval import (
     BatchConfig,
     _compute_cost,
     _count_completed_runs,
     _run_key,
     _run_result_to_row,
+    run_batch,
 )
 from tests.eval.eval_runner import RunResult
 from tests.eval.scenarios import basic_2step
@@ -66,6 +70,48 @@ def test_run_result_to_row_records_reasoning_replay() -> None:
     assert default_row["reasoning_replay"] == DEFAULT_REASONING_REPLAY
 
 
+@pytest.mark.asyncio
+async def test_anthropic_batch_rows_record_selected_reasoning_replay(
+    tmp_path, monkeypatch,
+) -> None:
+    """Anthropic rows must use the runtime policy, not the module default."""
+    cfg = BatchConfig(
+        model="claude-sonnet-4-6",
+        backend="anthropic",
+        mode="native",
+        think=True,
+    )
+    output = tmp_path / "results.jsonl"
+
+    monkeypatch.setattr(batch_eval, "ALL_SCENARIOS", [basic_2step])
+    monkeypatch.setattr(batch_eval, "_build_client", lambda config, models_dir: object())
+
+    async def fake_run_with_timeout(client, scenario, eval_config, ablation):
+        assert eval_config.reasoning_replay == "none"
+        return RunResult(
+            scenario_name=scenario.name,
+            completeness=True,
+            iterations_used=3,
+            accuracy=True,
+            messages=None,
+        )
+
+    monkeypatch.setattr(batch_eval, "_run_with_timeout", fake_run_with_timeout)
+
+    await run_batch(
+        configs=[cfg],
+        runs_per_scenario=1,
+        output_path=output,
+        tags=["plumbing"],
+        reasoning_replay="none",
+    )
+
+    row = json.loads(output.read_text().strip())
+    assert row["model"] == "claude-sonnet-4-6"
+    assert row["backend"] == "anthropic"
+    assert row["reasoning_replay"] == "none"
+
+
 def test_count_completed_runs_separates_policies(tmp_path) -> None:
     rows = [
         _row("M", "sc", "none"),

From 2268c99122c51a52021ce77ea458e8c5ed75e6d5 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 11 Jun 2026 17:06:07 -0500
Subject: [PATCH 08/14] Add GPU-A catch-up replay eval shard

---
 eval_results_catchup_gpuA_v0.7.5.jsonl | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 eval_results_catchup_gpuA_v0.7.5.jsonl

diff --git a/eval_results_catchup_gpuA_v0.7.5.jsonl b/eval_results_catchup_gpuA_v0.7.5.jsonl
new file mode 100644
index 0000000..3a92c15
--- /dev/null
+++ b/eval_results_catchup_gpuA_v0.7.5.jsonl
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:73a07bd8916acae8a7ec2fcb7c79a8e6d8f05e2bcfa9db6aa894c5c0ab020420
+size 12170481

From ab6f5e5550b3923e3577da73f8ba89af8c2fb4d2 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 11 Jun 2026 18:41:24 -0500
Subject: [PATCH 09/14] eval: merge reasoning replay catch-up results

Merge the catch-up reasoning-replay eval rows into the canonical v0.7.5
dataset and remove the temporary GPU-A shard. Add FORGE_EVAL_PORT so
concurrent local eval workers can use separate llama-server ports.
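
For illustration, a minimal sketch of how two workers might resolve
separate ports through the new variable. Here eval_port mirrors the
_eval_port helper added in this patch but takes an explicit mapping so it
can be exercised without touching the real environment; base_url and the
port values 8081 and 8082 are assumptions made for the example.

```python
import os
from typing import Mapping, Optional


def eval_port(env: Optional[Mapping[str, str]] = None) -> int:
    """Read FORGE_EVAL_PORT from `env` (os.environ if None), default 8080."""
    env = os.environ if env is None else env
    return int(env.get("FORGE_EVAL_PORT", "8080"))


def base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """OpenAI-style base URL for the local server on the resolved port."""
    return f"http://localhost:{eval_port(env)}/v1"


# Two hypothetical workers, each launched with its own environment.
worker_a = {"FORGE_EVAL_PORT": "8081"}
worker_b = {"FORGE_EVAL_PORT": "8082"}

print(base_url(worker_a))  # worker A talks to its own server
print(base_url(worker_b))  # worker B does not collide with A
print(base_url({}))        # variable unset: falls back to port 8080
```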
---
 eval_results_catchup_gpuA_v0.7.5.jsonl |  3 ---
 eval_results_v0.7.5.jsonl              |  4 ++--
 tests/eval/batch_eval.py               | 13 ++++++++++---
 3 files changed, 12 insertions(+), 8 deletions(-)
 delete mode 100644 eval_results_catchup_gpuA_v0.7.5.jsonl

diff --git a/eval_results_catchup_gpuA_v0.7.5.jsonl b/eval_results_catchup_gpuA_v0.7.5.jsonl
deleted file mode 100644
index 3a92c15..0000000
--- a/eval_results_catchup_gpuA_v0.7.5.jsonl
+++ /dev/null
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:73a07bd8916acae8a7ec2fcb7c79a8e6d8f05e2bcfa9db6aa894c5c0ab020420
-size 12170481
diff --git a/eval_results_v0.7.5.jsonl b/eval_results_v0.7.5.jsonl
index 9c6a070..9182af5 100644
--- a/eval_results_v0.7.5.jsonl
+++ b/eval_results_v0.7.5.jsonl
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:be698e8707360458c0e2310516149eed2e295c186bb10b847424a5f554418ab5
-size 70832005
+oid sha256:8d0feb4f9f4456ba7b45cdb9a9fe407f593abecf7293c40fc40d30a12ebccfba
+size 106268073
diff --git a/tests/eval/batch_eval.py b/tests/eval/batch_eval.py
index 25ee699..98e295f 100644
--- a/tests/eval/batch_eval.py
+++ b/tests/eval/batch_eval.py
@@ -12,6 +12,7 @@
 
 import asyncio
 import json
+import os
 import subprocess
 import sys
 import time
@@ -31,6 +32,11 @@
 
 MODELS_DIR_DEFAULT = Path("models")
 
+
+def _eval_port() -> int:
+    """llama-server port for eval workers; overridden by rig wrappers."""
+    return int(os.environ.get("FORGE_EVAL_PORT", "8080"))
+
 # GGUF and llamafile model files for local-server backends.
 # Each entry is just the filename — paired into a BatchConfig below
 # alongside the canonical identity (the file stem, no extension).
@@ -580,6 +586,7 @@ def _build_client(config: BatchConfig, models_dir: Path) -> Any:
         return LlamafileClient(
             gguf_path=str(models_dir / config.gguf_filename),
             mode=config.mode, think=think_val,
+            base_url=f"http://localhost:{_eval_port()}/v1",
             recommended_sampling=recommended_sampling,
         )
 
@@ -591,7 +598,7 @@ def _build_client(config: BatchConfig, models_dir: Path) -> Any:
             gguf_path=str(models_dir / config.gguf_filename),
             mode=config.mode,
             think=think_val,
-            base_url="http://localhost:8080/v1",
+            base_url=f"http://localhost:{_eval_port()}/v1",
             recommended_sampling=recommended_sampling,
         )
 
@@ -703,7 +710,7 @@ async def run_batch(
     total_ran = 0
     total_failed_connect = 0
     batch_start = time.monotonic()
-    server = ServerManager(backend="ollama", port=8080, models_dir=models_dir)
+    server = ServerManager(backend="ollama", port=_eval_port(), models_dir=models_dir)
     prev_backend: str | None = None
     prev_server: ServerManager | None = None
 
@@ -846,7 +853,7 @@ async def run_batch(
                 if prev_server is not None and prev_backend != "ollama":
                     await prev_server.stop()
                 server = ServerManager(
-                    backend=config.backend, port=8080, models_dir=models_dir
+                    backend=config.backend, port=_eval_port(), models_dir=models_dir
                 )
 
             # Resolve GGUF/llamafile path for non-Ollama backends

From 870f2dbfdeda52505f5cb072735ac770a0c0d66d Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 11 Jun 2026 20:39:48 -0500
Subject: [PATCH 10/14] feat(reasoning): default reasoning_replay to none

The v0.7.5 eval grid showed dropping replayed reasoning is statistically
indistinguishable from replay-all on score while saving the replayed
tokens every turn, so the bounded policy becomes the default. Help
strings, the resume-fold docstring, and the Anthropic prompt-caching
rationale updated to match; default-behavior tests now assert omission,
with fold/exposure mechanics re-pinned under explicit keep-last.
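
To make the three policies concrete, here is a simplified sketch of what
each one does to outgoing history. It is not the code in
forge/core/reasoning.py: apply_replay is a hypothetical helper, and it
assumes reasoning travels only in a reasoning_content field, which may not
match how the real module carries reasoning under "full".

```python
from typing import Any

REASONING_KEY = "reasoning_content"
POLICIES = ("full", "keep-last", "none")


def apply_replay(
    messages: list[dict[str, Any]], policy: str = "none"
) -> list[dict[str, Any]]:
    """Return copies of `messages` with reasoning trimmed per `policy`."""
    if policy not in POLICIES:
        raise ValueError(f"unknown reasoning_replay policy: {policy!r}")
    if policy == "full":
        return [dict(m) for m in messages]

    # Under keep-last, find the newest message that carries reasoning.
    keep_idx = None
    if policy == "keep-last":
        for i in range(len(messages) - 1, -1, -1):
            if messages[i].get(REASONING_KEY):
                keep_idx = i
                break

    trimmed = []
    for i, message in enumerate(messages):
        copy = dict(message)  # shallow copy; the input list is not mutated
        if i != keep_idx:
            copy.pop(REASONING_KEY, None)
        trimmed.append(copy)
    return trimmed


history = [
    {"role": "assistant", "content": None, REASONING_KEY: "old", "name": "a1"},
    {"role": "assistant", "content": None, REASONING_KEY: "latest", "name": "a2"},
]
for policy in POLICIES:
    kept = [m.get(REASONING_KEY) for m in apply_replay(history, policy)]
    print(policy, kept)
```

Under the new default every reasoning field is dropped while the rest of
each message survives, which matches what the updated handler tests assert.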

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 src/forge/clients/anthropic.py       | 14 +++++++-------
 src/forge/core/reasoning.py          |  2 +-
 src/forge/proxy/__main__.py          |  2 +-
 tests/eval/batch_eval.py             |  4 ++--
 tests/eval/eval_runner.py            |  2 +-
 tests/unit/test_batch_eval_resume.py |  6 +++---
 tests/unit/test_proxy_convert.py     | 21 +++++++++++++++++++--
 tests/unit/test_proxy_handler.py     | 17 +++++++++++++++++
 tests/unit/test_runner.py            | 26 +++++++++++++++++++++++---
 9 files changed, 74 insertions(+), 20 deletions(-)

diff --git a/src/forge/clients/anthropic.py b/src/forge/clients/anthropic.py
index 7cea440..6e57286 100644
--- a/src/forge/clients/anthropic.py
+++ b/src/forge/clients/anthropic.py
@@ -310,13 +310,13 @@ def _apply_static_cache(kwargs: dict[str, Any]) -> None:
         instead of re-billing the re-sent schema + prompt at full price.
 
         Static-only on purpose: a *rolling* per-turn breakpoint over the growing
-        conversation is NOT placed here. The eval's default
-        ``reasoning_replay="keep-last"`` re-serializes earlier tool-call messages
-        differently each turn (it keeps only the latest reasoning), which busts a
-        rolling prefix cache — you'd pay 1.25× writes with no reads. The
-        conversation prefix is only stable under ``none``/``full``, and
-        ``reasoning_replay`` is a measured variable we won't pin, so caching is
-        confined to the always-stable tools+system region.
+        conversation is NOT placed here. Under ``reasoning_replay="keep-last"``
+        earlier tool-call messages re-serialize differently each turn (only the
+        latest reasoning is kept), which busts a rolling prefix cache — you'd
+        pay 1.25× writes with no reads. The conversation prefix is only stable
+        under ``none``/``full``, and ``reasoning_replay`` is a measured eval
+        variable we won't pin, so caching is confined to the always-stable
+        tools+system region.
 
         The cached prefix is ordered tools → system → messages, so a single
         breakpoint on the system block subsumes the tools; we additionally mark
diff --git a/src/forge/core/reasoning.py b/src/forge/core/reasoning.py
index df0acec..0750598 100644
--- a/src/forge/core/reasoning.py
+++ b/src/forge/core/reasoning.py
@@ -8,7 +8,7 @@
 
 ReasoningReplay = Literal["full", "keep-last", "none"]
 REASONING_REPLAY_CHOICES: tuple[ReasoningReplay, ...] = ("full", "keep-last", "none")
-DEFAULT_REASONING_REPLAY: ReasoningReplay = "keep-last"
+DEFAULT_REASONING_REPLAY: ReasoningReplay = "none"
 
 
 def validate_reasoning_replay(value: str) -> ReasoningReplay:
diff --git a/src/forge/proxy/__main__.py b/src/forge/proxy/__main__.py
index d29f61a..455defa 100644
--- a/src/forge/proxy/__main__.py
+++ b/src/forge/proxy/__main__.py
@@ -91,7 +91,7 @@ def main() -> None:
         choices=REASONING_REPLAY_CHOICES,
         default=DEFAULT_REASONING_REPLAY,
         help="How much captured reasoning to replay to the backend "
-             "(default: keep-last).",
+             "(default: none).",
     )
     parser.add_argument("--verbose", "-v", action="store_true", help="Verbose logging")
 
diff --git a/tests/eval/batch_eval.py b/tests/eval/batch_eval.py
index 98e295f..6a5d7e7 100644
--- a/tests/eval/batch_eval.py
+++ b/tests/eval/batch_eval.py
@@ -280,7 +280,7 @@ def _count_completed_runs(
 
     Returns dict mapping the canonical run key → count. Records without an
     ablation field are treated as "reforged", without tool_choice as "auto",
-    and without reasoning_replay as the default policy (keep-last) — so
+    and without reasoning_replay as the default policy (none) — so
     pre-knob dumps resume cleanly under the default and are re-run under a
     different policy.
     """
@@ -1074,7 +1074,7 @@ async def main() -> None:
         choices=list(REASONING_REPLAY_CHOICES),
         default=DEFAULT_REASONING_REPLAY,
         help="How much captured reasoning to replay to the backend each turn: "
-        "full (legacy), keep-last (default), none. Part of the resume key, so "
+        "full (legacy), keep-last, none (default). Part of the resume key, so "
         "distinct policies for the same model/scenario are independent runs.",
     )
     parser.add_argument(
diff --git a/tests/eval/eval_runner.py b/tests/eval/eval_runner.py
index 505628a..94cb3ac 100644
--- a/tests/eval/eval_runner.py
+++ b/tests/eval/eval_runner.py
@@ -576,7 +576,7 @@ async def main() -> None:
         choices=list(REASONING_REPLAY_CHOICES),
         default=DEFAULT_REASONING_REPLAY,
         help="How much captured reasoning to replay to the backend each turn: "
-        "full (legacy: replay all), keep-last (default: only most recent), none (drop all).",
+        "full (legacy: replay all), keep-last (only most recent), none (default: drop all).",
     )
     parser.add_argument(
         "--tool-choice",
diff --git a/tests/unit/test_batch_eval_resume.py b/tests/unit/test_batch_eval_resume.py
index 937c666..a32cab9 100644
--- a/tests/unit/test_batch_eval_resume.py
+++ b/tests/unit/test_batch_eval_resume.py
@@ -133,10 +133,10 @@ def test_count_completed_runs_separates_policies(tmp_path) -> None:
     def key(rr: str) -> str:
         return _run_key("M", "llamaserver", "native", "reforged", "auto", rr, "sc")
 
-    assert counts[key("none")] == 2
+    # explicit none ×2 + the legacy row defaulting to none
+    assert counts[key("none")] == 3
     assert counts[key("full")] == 1
-    # explicit keep-last + the legacy row defaulting to keep-last
-    assert counts[key("keep-last")] == 2
+    assert counts[key("keep-last")] == 1
     assert counts[key("none")] + counts[key("full")] + counts[key("keep-last")] == 5
 
 
diff --git a/tests/unit/test_proxy_convert.py b/tests/unit/test_proxy_convert.py
index 11b5a2e..07be9bb 100644
--- a/tests/unit/test_proxy_convert.py
+++ b/tests/unit/test_proxy_convert.py
@@ -149,12 +149,21 @@ def test_multiple_tool_calls(self):
         ])
         assert len(result["choices"][0]["message"]["tool_calls"]) == 2
 
-    def test_reasoning_default_exposed_as_reasoning_content(self):
+    def test_reasoning_omitted_by_default(self):
+        # Default policy is "none": reasoning is not exposed on the response.
         result = tool_calls_to_openai([
             ToolCall(tool="search", args={}, reasoning="Let me think..."),
         ])
         msg = result["choices"][0]["message"]
         assert msg["content"] is None
+        assert "reasoning_content" not in msg
+
+    def test_keep_last_reasoning_replay_exposed_as_reasoning_content(self):
+        result = tool_calls_to_openai([
+            ToolCall(tool="search", args={}, reasoning="Let me think..."),
+        ], reasoning_replay="keep-last")
+        msg = result["choices"][0]["message"]
+        assert msg["content"] is None
         assert msg["reasoning_content"] == "Let me think..."
 
     def test_full_reasoning_replay_exposes_reasoning_in_content(self):
@@ -216,10 +225,18 @@ def test_single_tool_call_structure(self):
         assert events[-1]["choices"][0]["finish_reason"] == "tool_calls"
         assert events[-1]["choices"][0]["delta"] == {}
 
-    def test_reasoning_prepended_as_reasoning_content_by_default(self):
+    def test_reasoning_omitted_from_stream_by_default(self):
+        # Default policy is "none": no reasoning delta is streamed.
         events = tool_calls_to_sse_events([
             ToolCall(tool="search", args={}, reasoning="Thinking..."),
         ])
+        assert len(events) == 2
+        assert "tool_calls" in events[0]["choices"][0]["delta"]
+
+    def test_keep_last_reasoning_replay_streams_reasoning_content_delta(self):
+        events = tool_calls_to_sse_events([
+            ToolCall(tool="search", args={}, reasoning="Thinking..."),
+        ], reasoning_replay="keep-last")
         # reasoning delta + tool call delta + final
         assert len(events) == 3
         assert events[0]["choices"][0]["delta"]["reasoning_content"] == "Thinking..."
diff --git a/tests/unit/test_proxy_handler.py b/tests/unit/test_proxy_handler.py
index c4689c4..8b42f18 100644
--- a/tests/unit/test_proxy_handler.py
+++ b/tests/unit/test_proxy_handler.py
@@ -520,11 +520,28 @@ async def test_default_reasoning_replay_filters_raw_reasoning_only(self):
             _body(messages=messages, tools=[_tool_def("search")]),
             client, _context_manager(),
         )
+        # Default policy is "none": every reasoning field is stripped, but the
+        # rest of each raw message survives verbatim.
         sent_messages = client.send.call_args.args[0]
         assert sent_messages[0]["name"] == "a1"
         assert sent_messages[0]["vendor"] == {"kept": True}
         assert "reasoning_content" not in sent_messages[0]
         assert sent_messages[1]["name"] == "a2"
+        assert "reasoning_content" not in sent_messages[1]
+
+    @pytest.mark.asyncio
+    async def test_keep_last_reasoning_replay_keeps_latest_only(self):
+        client = _mock_client([ToolCall(tool="search", args={"q": "x"})])
+        messages = [
+            {"role": "assistant", "content": None, "reasoning_content": "old", "tool_calls": [], "name": "a1"},
+            {"role": "assistant", "content": None, "reasoning_content": "latest", "tool_calls": [], "name": "a2"},
+        ]
+        await handle_chat_completions(
+            _body(messages=messages, tools=[_tool_def("search")]),
+            client, _context_manager(), reasoning_replay="keep-last",
+        )
+        sent_messages = client.send.call_args.args[0]
+        assert "reasoning_content" not in sent_messages[0]
         assert sent_messages[1]["reasoning_content"] == "latest"
 
     @pytest.mark.asyncio
diff --git a/tests/unit/test_runner.py b/tests/unit/test_runner.py
index ac69900..9a33c07 100644
--- a/tests/unit/test_runner.py
+++ b/tests/unit/test_runner.py
@@ -122,6 +122,7 @@ def _make_runner(
     stream: bool = False,
     on_chunk=None,
     budget_tokens: int = 100_000,
+    reasoning_replay: str = "none",
 ) -> WorkflowRunner:
     """Create a WorkflowRunner with NoCompact strategy and generous budget."""
     ctx = ContextManager(strategy=NoCompact(), budget_tokens=budget_tokens)
@@ -133,6 +134,7 @@ def _make_runner(
         max_tool_errors=max_tool_errors,
         stream=stream,
         on_chunk=on_chunk,
+        reasoning_replay=reasoning_replay,
     )
 
 
@@ -1335,8 +1337,8 @@ def spy_compact(messages, step_index=0, step_hint=""):
         assert types[reasoning_idx + 1] == MessageType.TOOL_CALL
 
     @pytest.mark.asyncio
-    async def test_reasoning_folded_into_tool_call_on_wire(self):
-        """Reasoning is folded into the tool_call message's content on the wire."""
+    async def test_reasoning_not_on_wire_by_default(self):
+        """Default reasoning_replay="none": reasoning never reaches the wire."""
         client = MockClient([
             ToolCall(tool="fetch", args={}, reasoning="Thinking about this..."),
             ToolCall(tool="submit", args={}),
@@ -1344,6 +1346,24 @@ async def test_reasoning_folded_into_tool_call_on_wire(self):
         runner = _make_runner(client)
         await runner.run(_make_workflow(), "go", prompt_vars={"role": "agent"})
 
+        # Second send call: the tool_call message carries no reasoning content
+        second_call_msgs = client.send_calls[1][0]
+        # system, user, tool_call(assistant), tool_result(tool)
+        assert len(second_call_msgs) == 4
+        assert second_call_msgs[2]["role"] == "assistant"
+        assert second_call_msgs[2]["content"] == ""
+        assert "tool_calls" in second_call_msgs[2]
+
+    @pytest.mark.asyncio
+    async def test_reasoning_folded_into_tool_call_on_wire(self):
+        """With a replaying policy, reasoning is folded into the tool_call message's content on the wire."""
+        client = MockClient([
+            ToolCall(tool="fetch", args={}, reasoning="Thinking about this..."),
+            ToolCall(tool="submit", args={}),
+        ])
+        runner = _make_runner(client, reasoning_replay="keep-last")
+        await runner.run(_make_workflow(), "go", prompt_vars={"role": "agent"})
+
         # Second send call: reasoning folded into tool_call content
         second_call_msgs = client.send_calls[1][0]
         # system, user, tool_call(assistant with content), tool_result(tool)
@@ -1360,7 +1380,7 @@ async def test_text_response_not_folded_into_tool_call(self):
             ToolCall(tool="fetch", args={}, reasoning="Now I know what to do"),
             ToolCall(tool="submit", args={}),
         ])
-        runner = _make_runner(client)
+        runner = _make_runner(client, reasoning_replay="keep-last")
         await runner.run(_make_workflow(), "go", prompt_vars={"role": "agent"})
 
         # Third send call: after text_response+nudge recovery, then reasoning+fetch

From d6b4a57155ed8a31d9018356e37dbb049a9aa984 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 11 Jun 2026 20:39:58 -0500
Subject: [PATCH 11/14] data(eval): re-stamp Haiku v0.7.5 rows reasoning_replay
 keep-last -> none

The Haiku baseline ran before the default-policy decision and recorded
keep-last; Sonnet/Opus recorded none. The knob is request-inert for
Claude rows (no captured reasoning is replayed), so the field is a label,
not a behavioral difference; the rows are re-stamped for a consistent
board. Targeted byte-level edit of the 3,900 Haiku rows; all other lines
are byte-identical.
Post-edit validation: 170,300 rows, 0 bad JSON, 0 duplicate run keys.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
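Note (not part of the commit message): a minimal sketch of the kind of
targeted re-stamp and post-edit validation described above. It assumes one
JSON object per line and that the field is serialized exactly as
`"reasoning_replay": "keep-last"`. The `is_target` predicate and the
`model`/`run_id` key fields are hypothetical stand-ins, not the real
dataset schema or the script that was actually used.

```python
import json

# Assumed on-disk serialization of the field (an assumption, not confirmed).
OLD = b'"reasoning_replay": "keep-last"'
NEW = b'"reasoning_replay": "none"'


def restamp(lines, is_target):
    """Return (new_lines, changed_count).

    Lines that are not targets are passed through untouched, so they stay
    byte-identical to the input.
    """
    out, changed = [], 0
    for line in lines:
        if is_target(line) and OLD in line:
            line = line.replace(OLD, NEW, 1)
            changed += 1
        out.append(line)
    return out, changed


def validate(lines, key_fields=("model", "run_id")):
    """Return (bad_json_count, duplicate_key_count).

    key_fields is a hypothetical run key, used only for illustration.
    """
    bad, dupes, seen = 0, 0, set()
    for line in lines:
        try:
            row = json.loads(line)
        except ValueError:  # covers JSONDecodeError and bad UTF-8
            bad += 1
            continue
        key = tuple(row.get(f) for f in key_fields)
        if key in seen:
            dupes += 1
        seen.add(key)
    return bad, dupes
```

Replacing `keep-last` with `none` shortens a row by 5 bytes, and
3,900 rows x 5 bytes = 19,500 bytes, which matches the LFS pointer size
change below (106268073 -> 106248573).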
 eval_results_v0.7.5.jsonl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/eval_results_v0.7.5.jsonl b/eval_results_v0.7.5.jsonl
index 9182af5..06e82f1 100644
--- a/eval_results_v0.7.5.jsonl
+++ b/eval_results_v0.7.5.jsonl
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8d0feb4f9f4456ba7b45cdb9a9fe407f593abecf7293c40fc40d30a12ebccfba
-size 106268073
+oid sha256:7cf0797e5f77e04871635d4e61fb473922c4ac1de1b028954ec39476dca4c787
+size 106248573

From 81dcbfb8f49452fa828f497ff76c21943552e169 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 11 Jun 2026 20:40:07 -0500
Subject: [PATCH 12/14] feat(eval): reasoning_replay as a first-class
 report/dashboard dimension

ConfigKey (display identity) gains the policy so none/keep-last/full
render as separate rows, tagged :keep-last/:full (untagged = the none
default; pre-knob rows count as full, since that is what they ran). The
dedup identity (_config_tuple) deliberately excludes it so latest-gen-wins
still supersedes pre-knob rows whole-config instead of keeping them as
stale :full duplicates. Adds the reasoning-replay.md policy-comparison
view, a --reasoning-replay report filter, a Reasoning Replay dashboard
filter dimension with canonical ordering, and the gen-3 legend entry
(tag ref v0.7.5; the squash SHA does not exist pre-merge). Reports and
dashboard regenerated from all four dataset files.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
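Note (not part of the commit message): a rough sketch of the split between
display identity and dedup identity described above. The `Row` fields, the
label format, and the helper names are hypothetical; only the behavior
follows the commit message (policy is part of the display label, excluded
from the dedup key, and pre-knob rows count as full). This is not the
actual `ConfigKey` or `_config_tuple` code from tests/eval/report.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    # Hypothetical row shape, for illustration only.
    model: str
    backend: str
    generation: int
    reasoning_replay: Optional[str] = None  # None = pre-knob row


def policy(row):
    # Pre-knob rows ran with full replay, so they count as "full".
    return row.reasoning_replay or "full"


def display_label(row):
    # Display identity: includes the policy; the "none" default is untagged.
    base = f"{row.model}@{row.backend}"
    p = policy(row)
    return base if p == "none" else f"{base}:{p}"


def dedup_key(row):
    # Dedup identity: deliberately excludes the policy.
    return (row.model, row.backend)


def latest_gen_wins(rows):
    # Keep only rows from the newest generation seen for each dedup key.
    newest = {}
    for r in rows:
        k = dedup_key(r)
        newest[k] = max(newest.get(k, r.generation), r.generation)
    return [r for r in rows if r.generation == newest[dedup_key(r)]]
```

Because the policy is left out of `dedup_key`, an older pre-knob row is
dropped as soon as any newer-generation row exists for the same config,
rather than surviving as a separate `:full` entry.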
 docs/results/dashboard.html             |  13 +-
 docs/results/index.md                   |   3 +-
 docs/results/raw/native-vs-prompt.md    | 232 ++++----
 docs/results/raw/reasoning-replay.md    | 420 ++++++++++++++
 docs/results/raw/reforged-vs-bare.md    | 708 +++++++++++++-----------
 docs/results/raw/reforged/all.md        | 151 +++--
 docs/results/raw/reforged/by-backend.md | 172 +++---
 docs/results/raw/reforged/by-family.md  | 263 +++++----
 tests/eval/dashboard/src/Sidebar.tsx    |   8 +-
 tests/eval/dashboard/src/types.ts       |  10 +-
 tests/eval/dashboard/src/utils.ts       |  10 +-
 tests/eval/report.py                    | 120 +++-
 12 files changed, 1434 insertions(+), 676 deletions(-)
 create mode 100644 docs/results/raw/reasoning-replay.md

diff --git a/docs/results/dashboard.html b/docs/results/dashboard.html
index 530f8d6..1519700 100644
--- a/docs/results/dashboard.html
+++ b/docs/results/dashboard.html
@@ -4,19 +4,20 @@
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
     <title>Forge Eval Dashboard</title>
-    <script type="module" crossorigin>(function(){const M=document.createElement("link").relList;if(M&&M.supports&&M.supports("modulepreload"))return;for(const q of document.querySelectorAll('link[rel="modulepreload"]'))o(q);new MutationObserver(q=>{for(const Y of q)if(Y.type==="childList")for(const H of Y.addedNodes)H.tagName==="LINK"&&H.rel==="modulepreload"&&o(H)}).observe(document,{childList:!0,subtree:!0});function C(q){const Y={};return q.integrity&&(Y.integrity=q.integrity),q.referrerPolicy&&(Y.referrerPolicy=q.referrerPolicy),q.crossOrigin==="use-credentials"?Y.credentials="include":q.crossOrigin==="anonymous"?Y.credentials="omit":Y.credentials="same-origin",Y}function o(q){if(q.ep)return;q.ep=!0;const Y=C(q);fetch(q.href,Y)}})();var cf={exports:{}},bu={};var bo;function ar(){if(bo)return bu;bo=1;var g=Symbol.for("react.transitional.element"),M=Symbol.for("react.fragment");function C(o,q,Y){var H=null;if(Y!==void 0&&(H=""+Y),q.key!==void 0&&(H=""+q.key),"key"in q){Y={};for(var R in q)R!=="key"&&(Y[R]=q[R])}else Y=q;return q=Y.ref,{$$typeof:g,type:o,key:H,ref:q!==void 0?q:null,props:Y}}return bu.Fragment=M,bu.jsx=C,bu.jsxs=C,bu}var zo;function ur(){return zo||(zo=1,cf.exports=ar()),cf.exports}var _=ur(),ff={exports:{}},L={};var Eo;function nr(){if(Eo)return L;Eo=1;var g=Symbol.for("react.transitional.element"),M=Symbol.for("react.portal"),C=Symbol.for("react.fragment"),o=Symbol.for("react.strict_mode"),q=Symbol.for("react.profiler"),Y=Symbol.for("react.consumer"),H=Symbol.for("react.context"),R=Symbol.for("react.forward_ref"),p=Symbol.for("react.suspense"),E=Symbol.for("react.memo"),G=Symbol.for("react.lazy"),U=Symbol.for("react.activity"),x=Symbol.iterator;function tl(d){return d===null||typeof d!="object"?null:(d=x&&d[x]||d["@@iterator"],typeof d=="function"?d:null)}var I={isMounted:function(){return!1},enqueueForceUpdate:function(){},enqueueReplaceState:function(){},enqueueSetState:function(){}},vl=Object.assign,al={};function 
pl(d,v,O){this.props=d,this.context=v,this.refs=al,this.updater=O||I}pl.prototype.isReactComponent={},pl.prototype.setState=function(d,v){if(typeof d!="object"&&typeof d!="function"&&d!=null)throw Error("takes an object of state variables to update or a function which returns an object of state variables.");this.updater.enqueueSetState(this,d,v,"setState")},pl.prototype.forceUpdate=function(d){this.updater.enqueueForceUpdate(this,d,"forceUpdate")};function $(){}$.prototype=pl.prototype;function ul(d,v,O){this.props=d,this.context=v,this.refs=al,this.updater=O||I}var nl=ul.prototype=new $;nl.constructor=ul,vl(nl,pl.prototype),nl.isPureReactComponent=!0;var Gl=Array.isArray;function El(){}var W={H:null,A:null,T:null,S:null},Hl=Object.prototype.hasOwnProperty;function Kl(d,v,O){var D=O.ref;return{$$typeof:g,type:d,key:v,ref:D!==void 0?D:null,props:O}}function $l(d,v){return Kl(d.type,v,d.props)}function kl(d){return typeof d=="object"&&d!==null&&d.$$typeof===g}function fl(d){var v={"=":"=0",":":"=2"};return"$"+d.replace(/[=:]/g,function(O){return v[O]})}var Rt=/\/+/g;function _t(d,v){return typeof d=="object"&&d!==null&&d.key!=null?fl(""+d.key):v.toString(36)}function ut(d){switch(d.status){case"fulfilled":return d.value;case"rejected":throw d.reason;default:switch(typeof d.status=="string"?d.then(El,El):(d.status="pending",d.then(function(v){d.status==="pending"&&(d.status="fulfilled",d.value=v)},function(v){d.status==="pending"&&(d.status="rejected",d.reason=v)})),d.status){case"fulfilled":return d.value;case"rejected":throw d.reason}}throw d}function z(d,v,O,D,X){var K=typeof d;(K==="undefined"||K==="boolean")&&(d=null);var ol=!1;if(d===null)ol=!0;else switch(K){case"bigint":case"string":case"number":ol=!0;break;case"object":switch(d.$$typeof){case g:case M:ol=!0;break;case G:return ol=d._init,z(ol(d._payload),v,O,D,X)}}if(ol)return X=X(d),ol=D===""?"."+_t(d,0):D,Gl(X)?(O="",ol!=null&&(O=ol.replace(Rt,"$&/")+"/"),z(X,v,O,"",function(Oa){return 
Oa})):X!=null&&(kl(X)&&(X=$l(X,O+(X.key==null||d&&d.key===X.key?"":(""+X.key).replace(Rt,"$&/")+"/")+ol)),v.push(X)),1;ol=0;var wl=D===""?".":D+":";if(Gl(d))for(var xl=0;xl<d.length;xl++)D=d[xl],K=wl+_t(D,xl),ol+=z(D,v,O,K,X);else if(xl=tl(d),typeof xl=="function")for(d=xl.call(d),xl=0;!(D=d.next()).done;)D=D.value,K=wl+_t(D,xl++),ol+=z(D,v,O,K,X);else if(K==="object"){if(typeof d.then=="function")return z(ut(d),v,O,D,X);throw v=String(d),Error("Objects are not valid as a React child (found: "+(v==="[object Object]"?"object with keys {"+Object.keys(d).join(", ")+"}":v)+"). If you meant to render a collection of children, use an array instead.")}return ol}function N(d,v,O){if(d==null)return d;var D=[],X=0;return z(d,D,"","",function(K){return v.call(O,K,X++)}),D}function V(d){if(d._status===-1){var v=d._result;v=v(),v.then(function(O){(d._status===0||d._status===-1)&&(d._status=1,d._result=O)},function(O){(d._status===0||d._status===-1)&&(d._status=2,d._result=O)}),d._status===-1&&(d._status=0,d._result=v)}if(d._status===1)return d._result.default;throw d._result}var dl=typeof reportError=="function"?reportError:function(d){if(typeof window=="object"&&typeof window.ErrorEvent=="function"){var v=new window.ErrorEvent("error",{bubbles:!0,cancelable:!0,message:typeof d=="object"&&d!==null&&typeof d.message=="string"?String(d.message):String(d),error:d});if(!window.dispatchEvent(v))return}else if(typeof process=="object"&&typeof process.emit=="function"){process.emit("uncaughtException",d);return}console.error(d)},yl={map:N,forEach:function(d,v,O){N(d,function(){v.apply(this,arguments)},O)},count:function(d){var v=0;return N(d,function(){v++}),v},toArray:function(d){return N(d,function(v){return v})||[]},only:function(d){if(!kl(d))throw Error("React.Children.only expected to receive a single React element child.");return d}};return 
L.Activity=U,L.Children=yl,L.Component=pl,L.Fragment=C,L.Profiler=q,L.PureComponent=ul,L.StrictMode=o,L.Suspense=p,L.__CLIENT_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE=W,L.__COMPILER_RUNTIME={__proto__:null,c:function(d){return W.H.useMemoCache(d)}},L.cache=function(d){return function(){return d.apply(null,arguments)}},L.cacheSignal=function(){return null},L.cloneElement=function(d,v,O){if(d==null)throw Error("The argument must be a React element, but you passed "+d+".");var D=vl({},d.props),X=d.key;if(v!=null)for(K in v.key!==void 0&&(X=""+v.key),v)!Hl.call(v,K)||K==="key"||K==="__self"||K==="__source"||K==="ref"&&v.ref===void 0||(D[K]=v[K]);var K=arguments.length-2;if(K===1)D.children=O;else if(1<K){for(var ol=Array(K),wl=0;wl<K;wl++)ol[wl]=arguments[wl+2];D.children=ol}return Kl(d.type,X,D)},L.createContext=function(d){return d={$$typeof:H,_currentValue:d,_currentValue2:d,_threadCount:0,Provider:null,Consumer:null},d.Provider=d,d.Consumer={$$typeof:Y,_context:d},d},L.createElement=function(d,v,O){var D,X={},K=null;if(v!=null)for(D in v.key!==void 0&&(K=""+v.key),v)Hl.call(v,D)&&D!=="key"&&D!=="__self"&&D!=="__source"&&(X[D]=v[D]);var ol=arguments.length-2;if(ol===1)X.children=O;else if(1<ol){for(var wl=Array(ol),xl=0;xl<ol;xl++)wl[xl]=arguments[xl+2];X.children=wl}if(d&&d.defaultProps)for(D in ol=d.defaultProps,ol)X[D]===void 0&&(X[D]=ol[D]);return Kl(d,K,X)},L.createRef=function(){return{current:null}},L.forwardRef=function(d){return{$$typeof:R,render:d}},L.isValidElement=kl,L.lazy=function(d){return{$$typeof:G,_payload:{_status:-1,_result:d},_init:V}},L.memo=function(d,v){return{$$typeof:E,type:d,compare:v===void 0?null:v}},L.startTransition=function(d){var v=W.T,O={};W.T=O;try{var D=d(),X=W.S;X!==null&&X(O,D),typeof D=="object"&&D!==null&&typeof D.then=="function"&&D.then(El,dl)}catch(K){dl(K)}finally{v!==null&&O.types!==null&&(v.types=O.types),W.T=v}},L.unstable_useCacheRefresh=function(){return W.H.useCacheRefresh()},L.use=function(d){return 
W.H.use(d)},L.useActionState=function(d,v,O){return W.H.useActionState(d,v,O)},L.useCallback=function(d,v){return W.H.useCallback(d,v)},L.useContext=function(d){return W.H.useContext(d)},L.useDebugValue=function(){},L.useDeferredValue=function(d,v){return W.H.useDeferredValue(d,v)},L.useEffect=function(d,v){return W.H.useEffect(d,v)},L.useEffectEvent=function(d){return W.H.useEffectEvent(d)},L.useId=function(){return W.H.useId()},L.useImperativeHandle=function(d,v,O){return W.H.useImperativeHandle(d,v,O)},L.useInsertionEffect=function(d,v){return W.H.useInsertionEffect(d,v)},L.useLayoutEffect=function(d,v){return W.H.useLayoutEffect(d,v)},L.useMemo=function(d,v){return W.H.useMemo(d,v)},L.useOptimistic=function(d,v){return W.H.useOptimistic(d,v)},L.useReducer=function(d,v,O){return W.H.useReducer(d,v,O)},L.useRef=function(d){return W.H.useRef(d)},L.useState=function(d){return W.H.useState(d)},L.useSyncExternalStore=function(d,v,O){return W.H.useSyncExternalStore(d,v,O)},L.useTransition=function(){return W.H.useTransition()},L.version="19.2.4",L}var To;function hf(){return To||(To=1,ff.exports=nr()),ff.exports}var ml=hf(),sf={exports:{}},zu={},df={exports:{}},of={};var Ao;function cr(){return Ao||(Ao=1,(function(g){function M(z,N){var V=z.length;z.push(N);l:for(;0<V;){var dl=V-1>>>1,yl=z[dl];if(0<q(yl,N))z[dl]=N,z[V]=yl,V=dl;else break l}}function C(z){return z.length===0?null:z[0]}function o(z){if(z.length===0)return null;var N=z[0],V=z.pop();if(V!==N){z[0]=V;l:for(var dl=0,yl=z.length,d=yl>>>1;dl<d;){var v=2*(dl+1)-1,O=z[v],D=v+1,X=z[D];if(0>q(O,V))D<yl&&0>q(X,O)?(z[dl]=X,z[D]=V,dl=D):(z[dl]=O,z[v]=V,dl=v);else if(D<yl&&0>q(X,V))z[dl]=X,z[D]=V,dl=D;else break l}}return N}function q(z,N){var V=z.sortIndex-N.sortIndex;return V!==0?V:z.id-N.id}if(g.unstable_now=void 0,typeof performance=="object"&&typeof performance.now=="function"){var Y=performance;g.unstable_now=function(){return Y.now()}}else{var H=Date,R=H.now();g.unstable_now=function(){return H.now()-R}}var 
p=[],E=[],G=1,U=null,x=3,tl=!1,I=!1,vl=!1,al=!1,pl=typeof setTimeout=="function"?setTimeout:null,$=typeof clearTimeout=="function"?clearTimeout:null,ul=typeof setImmediate<"u"?setImmediate:null;function nl(z){for(var N=C(E);N!==null;){if(N.callback===null)o(E);else if(N.startTime<=z)o(E),N.sortIndex=N.expirationTime,M(p,N);else break;N=C(E)}}function Gl(z){if(vl=!1,nl(z),!I)if(C(p)!==null)I=!0,El||(El=!0,fl());else{var N=C(E);N!==null&&ut(Gl,N.startTime-z)}}var El=!1,W=-1,Hl=5,Kl=-1;function $l(){return al?!0:!(g.unstable_now()-Kl<Hl)}function kl(){if(al=!1,El){var z=g.unstable_now();Kl=z;var N=!0;try{l:{I=!1,vl&&(vl=!1,$(W),W=-1),tl=!0;var V=x;try{t:{for(nl(z),U=C(p);U!==null&&!(U.expirationTime>z&&$l());){var dl=U.callback;if(typeof dl=="function"){U.callback=null,x=U.priorityLevel;var yl=dl(U.expirationTime<=z);if(z=g.unstable_now(),typeof yl=="function"){U.callback=yl,nl(z),N=!0;break t}U===C(p)&&o(p),nl(z)}else o(p);U=C(p)}if(U!==null)N=!0;else{var d=C(E);d!==null&&ut(Gl,d.startTime-z),N=!1}}break l}finally{U=null,x=V,tl=!1}N=void 0}}finally{N?fl():El=!1}}}var fl;if(typeof ul=="function")fl=function(){ul(kl)};else if(typeof MessageChannel<"u"){var Rt=new MessageChannel,_t=Rt.port2;Rt.port1.onmessage=kl,fl=function(){_t.postMessage(null)}}else fl=function(){pl(kl,0)};function ut(z,N){W=pl(function(){z(g.unstable_now())},N)}g.unstable_IdlePriority=5,g.unstable_ImmediatePriority=1,g.unstable_LowPriority=4,g.unstable_NormalPriority=3,g.unstable_Profiling=null,g.unstable_UserBlockingPriority=2,g.unstable_cancelCallback=function(z){z.callback=null},g.unstable_forceFrameRate=function(z){0>z||125<z?console.error("forceFrameRate takes a positive int between 0 and 125, forcing frame rates higher than 125 fps is not supported"):Hl=0<z?Math.floor(1e3/z):5},g.unstable_getCurrentPriorityLevel=function(){return x},g.unstable_next=function(z){switch(x){case 1:case 2:case 3:var N=3;break;default:N=x}var V=x;x=N;try{return 
z()}finally{x=V}},g.unstable_requestPaint=function(){al=!0},g.unstable_runWithPriority=function(z,N){switch(z){case 1:case 2:case 3:case 4:case 5:break;default:z=3}var V=x;x=z;try{return N()}finally{x=V}},g.unstable_scheduleCallback=function(z,N,V){var dl=g.unstable_now();switch(typeof V=="object"&&V!==null?(V=V.delay,V=typeof V=="number"&&0<V?dl+V:dl):V=dl,z){case 1:var yl=-1;break;case 2:yl=250;break;case 5:yl=1073741823;break;case 4:yl=1e4;break;default:yl=5e3}return yl=V+yl,z={id:G++,callback:N,priorityLevel:z,startTime:V,expirationTime:yl,sortIndex:-1},V>dl?(z.sortIndex=V,M(E,z),C(p)===null&&z===C(E)&&(vl?($(W),W=-1):vl=!0,ut(Gl,V-dl))):(z.sortIndex=yl,M(p,z),I||tl||(I=!0,El||(El=!0,fl()))),z},g.unstable_shouldYield=$l,g.unstable_wrapCallback=function(z){var N=x;return function(){var V=x;x=N;try{return z.apply(this,arguments)}finally{x=V}}}})(of)),of}var po;function ir(){return po||(po=1,df.exports=cr()),df.exports}var mf={exports:{}},Jl={};var _o;function fr(){if(_o)return Jl;_o=1;var g=hf();function M(p){var E="https://react.dev/errors/"+p;if(1<arguments.length){E+="?args[]="+encodeURIComponent(arguments[1]);for(var G=2;G<arguments.length;G++)E+="&args[]="+encodeURIComponent(arguments[G])}return"Minified React error #"+p+"; visit "+E+" for the full message or use the non-minified dev environment for full errors and additional helpful warnings."}function C(){}var o={d:{f:C,r:function(){throw Error(M(522))},D:C,C,L:C,m:C,X:C,S:C,M:C},p:0,findDOMNode:null},q=Symbol.for("react.portal");function Y(p,E,G){var U=3<arguments.length&&arguments[3]!==void 0?arguments[3]:null;return{$$typeof:q,key:U==null?null:""+U,children:p,containerInfo:E,implementation:G}}var H=g.__CLIENT_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE;function R(p,E){if(p==="font")return"";if(typeof E=="string")return E==="use-credentials"?E:""}return Jl.__DOM_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE=o,Jl.createPortal=function(p,E){var G=2<arguments.length&&arguments[2]!==void 
0?arguments[2]:null;if(!E||E.nodeType!==1&&E.nodeType!==9&&E.nodeType!==11)throw Error(M(299));return Y(p,E,null,G)},Jl.flushSync=function(p){var E=H.T,G=o.p;try{if(H.T=null,o.p=2,p)return p()}finally{H.T=E,o.p=G,o.d.f()}},Jl.preconnect=function(p,E){typeof p=="string"&&(E?(E=E.crossOrigin,E=typeof E=="string"?E==="use-credentials"?E:"":void 0):E=null,o.d.C(p,E))},Jl.prefetchDNS=function(p){typeof p=="string"&&o.d.D(p)},Jl.preinit=function(p,E){if(typeof p=="string"&&E&&typeof E.as=="string"){var G=E.as,U=R(G,E.crossOrigin),x=typeof E.integrity=="string"?E.integrity:void 0,tl=typeof E.fetchPriority=="string"?E.fetchPriority:void 0;G==="style"?o.d.S(p,typeof E.precedence=="string"?E.precedence:void 0,{crossOrigin:U,integrity:x,fetchPriority:tl}):G==="script"&&o.d.X(p,{crossOrigin:U,integrity:x,fetchPriority:tl,nonce:typeof E.nonce=="string"?E.nonce:void 0})}},Jl.preinitModule=function(p,E){if(typeof p=="string")if(typeof E=="object"&&E!==null){if(E.as==null||E.as==="script"){var G=R(E.as,E.crossOrigin);o.d.M(p,{crossOrigin:G,integrity:typeof E.integrity=="string"?E.integrity:void 0,nonce:typeof E.nonce=="string"?E.nonce:void 0})}}else E==null&&o.d.M(p)},Jl.preload=function(p,E){if(typeof p=="string"&&typeof E=="object"&&E!==null&&typeof E.as=="string"){var G=E.as,U=R(G,E.crossOrigin);o.d.L(p,G,{crossOrigin:U,integrity:typeof E.integrity=="string"?E.integrity:void 0,nonce:typeof E.nonce=="string"?E.nonce:void 0,type:typeof E.type=="string"?E.type:void 0,fetchPriority:typeof E.fetchPriority=="string"?E.fetchPriority:void 0,referrerPolicy:typeof E.referrerPolicy=="string"?E.referrerPolicy:void 0,imageSrcSet:typeof E.imageSrcSet=="string"?E.imageSrcSet:void 0,imageSizes:typeof E.imageSizes=="string"?E.imageSizes:void 0,media:typeof E.media=="string"?E.media:void 0})}},Jl.preloadModule=function(p,E){if(typeof p=="string")if(E){var G=R(E.as,E.crossOrigin);o.d.m(p,{as:typeof E.as=="string"&&E.as!=="script"?E.as:void 0,crossOrigin:G,integrity:typeof 
E.integrity=="string"?E.integrity:void 0})}else o.d.m(p)},Jl.requestFormReset=function(p){o.d.r(p)},Jl.unstable_batchedUpdates=function(p,E){return p(E)},Jl.useFormState=function(p,E,G){return H.H.useFormState(p,E,G)},Jl.useFormStatus=function(){return H.H.useHostTransitionStatus()},Jl.version="19.2.4",Jl}var Oo;function sr(){if(Oo)return mf.exports;Oo=1;function g(){if(!(typeof __REACT_DEVTOOLS_GLOBAL_HOOK__>"u"||typeof __REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE!="function"))try{__REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE(g)}catch(M){console.error(M)}}return g(),mf.exports=fr(),mf.exports}var Mo;function dr(){if(Mo)return zu;Mo=1;var g=ir(),M=hf(),C=sr();function o(l){var t="https://react.dev/errors/"+l;if(1<arguments.length){t+="?args[]="+encodeURIComponent(arguments[1]);for(var e=2;e<arguments.length;e++)t+="&args[]="+encodeURIComponent(arguments[e])}return"Minified React error #"+l+"; visit "+t+" for the full message or use the non-minified dev environment for full errors and additional helpful warnings."}function q(l){return!(!l||l.nodeType!==1&&l.nodeType!==9&&l.nodeType!==11)}function Y(l){var t=l,e=l;if(l.alternate)for(;t.return;)t=t.return;else{l=t;do t=l,(t.flags&4098)!==0&&(e=t.return),l=t.return;while(l)}return t.tag===3?e:null}function H(l){if(l.tag===13){var t=l.memoizedState;if(t===null&&(l=l.alternate,l!==null&&(t=l.memoizedState)),t!==null)return t.dehydrated}return null}function R(l){if(l.tag===31){var t=l.memoizedState;if(t===null&&(l=l.alternate,l!==null&&(t=l.memoizedState)),t!==null)return t.dehydrated}return null}function p(l){if(Y(l)!==l)throw Error(o(188))}function E(l){var t=l.alternate;if(!t){if(t=Y(l),t===null)throw Error(o(188));return t!==l?null:l}for(var e=l,a=t;;){var u=e.return;if(u===null)break;var n=u.alternate;if(n===null){if(a=u.return,a!==null){e=a;continue}break}if(u.child===n.child){for(n=u.child;n;){if(n===e)return p(u),l;if(n===a)return p(u),t;n=n.sibling}throw Error(o(188))}if(e.return!==a.return)e=u,a=n;else{for(var 
c=!1,i=u.child;i;){if(i===e){c=!0,e=u,a=n;break}if(i===a){c=!0,a=u,e=n;break}i=i.sibling}if(!c){for(i=n.child;i;){if(i===e){c=!0,e=n,a=u;break}if(i===a){c=!0,a=n,e=u;break}i=i.sibling}if(!c)throw Error(o(189))}}if(e.alternate!==a)throw Error(o(190))}if(e.tag!==3)throw Error(o(188));return e.stateNode.current===e?l:t}function G(l){var t=l.tag;if(t===5||t===26||t===27||t===6)return l;for(l=l.child;l!==null;){if(t=G(l),t!==null)return t;l=l.sibling}return null}var U=Object.assign,x=Symbol.for("react.element"),tl=Symbol.for("react.transitional.element"),I=Symbol.for("react.portal"),vl=Symbol.for("react.fragment"),al=Symbol.for("react.strict_mode"),pl=Symbol.for("react.profiler"),$=Symbol.for("react.consumer"),ul=Symbol.for("react.context"),nl=Symbol.for("react.forward_ref"),Gl=Symbol.for("react.suspense"),El=Symbol.for("react.suspense_list"),W=Symbol.for("react.memo"),Hl=Symbol.for("react.lazy"),Kl=Symbol.for("react.activity"),$l=Symbol.for("react.memo_cache_sentinel"),kl=Symbol.iterator;function fl(l){return l===null||typeof l!="object"?null:(l=kl&&l[kl]||l["@@iterator"],typeof l=="function"?l:null)}var Rt=Symbol.for("react.client.reference");function _t(l){if(l==null)return null;if(typeof l=="function")return l.$$typeof===Rt?null:l.displayName||l.name||null;if(typeof l=="string")return l;switch(l){case vl:return"Fragment";case pl:return"Profiler";case al:return"StrictMode";case Gl:return"Suspense";case El:return"SuspenseList";case Kl:return"Activity"}if(typeof l=="object")switch(l.$$typeof){case I:return"Portal";case ul:return l.displayName||"Context";case $:return(l._context.displayName||"Context")+".Consumer";case nl:var t=l.render;return l=l.displayName,l||(l=t.displayName||t.name||"",l=l!==""?"ForwardRef("+l+")":"ForwardRef"),l;case W:return t=l.displayName||null,t!==null?t:_t(l.type)||"Memo";case Hl:t=l._payload,l=l._init;try{return _t(l(t))}catch{}}return null}var 
ut=Array.isArray,z=M.__CLIENT_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE,N=C.__DOM_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE,V={pending:!1,data:null,method:null,action:null},dl=[],yl=-1;function d(l){return{current:l}}function v(l){0>yl||(l.current=dl[yl],dl[yl]=null,yl--)}function O(l,t){yl++,dl[yl]=l.current,l.current=t}var D=d(null),X=d(null),K=d(null),ol=d(null);function wl(l,t){switch(O(K,t),O(X,l),O(D,null),t.nodeType){case 9:case 11:l=(l=t.documentElement)&&(l=l.namespaceURI)?Xd(l):0;break;default:if(l=t.tagName,t=t.namespaceURI)t=Xd(t),l=Zd(t,l);else switch(l){case"svg":l=1;break;case"math":l=2;break;default:l=0}}v(D),O(D,l)}function xl(){v(D),v(X),v(K)}function Oa(l){l.memoizedState!==null&&O(ol,l);var t=D.current,e=Zd(t,l.type);t!==e&&(O(X,l),O(D,e))}function Tu(l){X.current===l&&(v(D),v(X)),ol.current===l&&(v(ol),hu._currentValue=V)}var Zn,gf;function Ae(l){if(Zn===void 0)try{throw Error()}catch(e){var t=e.stack.trim().match(/\n( *(at )?)/);Zn=t&&t[1]||"",gf=-1<e.stack.indexOf(`
+    <script type="module" crossorigin>(function(){const M=document.createElement("link").relList;if(M&&M.supports&&M.supports("modulepreload"))return;for(const q of document.querySelectorAll('link[rel="modulepreload"]'))o(q);new MutationObserver(q=>{for(const Y of q)if(Y.type==="childList")for(const H of Y.addedNodes)H.tagName==="LINK"&&H.rel==="modulepreload"&&o(H)}).observe(document,{childList:!0,subtree:!0});function C(q){const Y={};return q.integrity&&(Y.integrity=q.integrity),q.referrerPolicy&&(Y.referrerPolicy=q.referrerPolicy),q.crossOrigin==="use-credentials"?Y.credentials="include":q.crossOrigin==="anonymous"?Y.credentials="omit":Y.credentials="same-origin",Y}function o(q){if(q.ep)return;q.ep=!0;const Y=C(q);fetch(q.href,Y)}})();var ff={exports:{}},bu={};var zo;function nr(){if(zo)return bu;zo=1;var h=Symbol.for("react.transitional.element"),M=Symbol.for("react.fragment");function C(o,q,Y){var H=null;if(Y!==void 0&&(H=""+Y),q.key!==void 0&&(H=""+q.key),"key"in q){Y={};for(var R in q)R!=="key"&&(Y[R]=q[R])}else Y=q;return q=Y.ref,{$$typeof:h,type:o,key:H,ref:q!==void 0?q:null,props:Y}}return bu.Fragment=M,bu.jsx=C,bu.jsxs=C,bu}var Eo;function cr(){return Eo||(Eo=1,ff.exports=nr()),ff.exports}var _=cr(),sf={exports:{}},L={};var To;function ir(){if(To)return L;To=1;var h=Symbol.for("react.transitional.element"),M=Symbol.for("react.portal"),C=Symbol.for("react.fragment"),o=Symbol.for("react.strict_mode"),q=Symbol.for("react.profiler"),Y=Symbol.for("react.consumer"),H=Symbol.for("react.context"),R=Symbol.for("react.forward_ref"),p=Symbol.for("react.suspense"),E=Symbol.for("react.memo"),G=Symbol.for("react.lazy"),U=Symbol.for("react.activity"),x=Symbol.iterator;function P(d){return d===null||typeof d!="object"?null:(d=x&&d[x]||d["@@iterator"],typeof d=="function"?d:null)}var k={isMounted:function(){return!1},enqueueForceUpdate:function(){},enqueueReplaceState:function(){},enqueueSetState:function(){}},vl=Object.assign,ul={};function 
zl(d,g,O){this.props=d,this.context=g,this.refs=ul,this.updater=O||k}zl.prototype.isReactComponent={},zl.prototype.setState=function(d,g){if(typeof d!="object"&&typeof d!="function"&&d!=null)throw Error("takes an object of state variables to update or a function which returns an object of state variables.");this.updater.enqueueSetState(this,d,g,"setState")},zl.prototype.forceUpdate=function(d){this.updater.enqueueForceUpdate(this,d,"forceUpdate")};function $(){}$.prototype=zl.prototype;function nl(d,g,O){this.props=d,this.context=g,this.refs=ul,this.updater=O||k}var al=nl.prototype=new $;al.constructor=nl,vl(al,zl.prototype),al.isPureReactComponent=!0;var Hl=Array.isArray;function Tl(){}var W={H:null,A:null,T:null,S:null},Bl=Object.prototype.hasOwnProperty;function Kl(d,g,O){var N=O.ref;return{$$typeof:h,type:d,key:g,ref:N!==void 0?N:null,props:O}}function $l(d,g){return Kl(d.type,g,d.props)}function kl(d){return typeof d=="object"&&d!==null&&d.$$typeof===h}function fl(d){var g={"=":"=0",":":"=2"};return"$"+d.replace(/[=:]/g,function(O){return g[O]})}var Rt=/\/+/g;function _t(d,g){return typeof d=="object"&&d!==null&&d.key!=null?fl(""+d.key):g.toString(36)}function ut(d){switch(d.status){case"fulfilled":return d.value;case"rejected":throw d.reason;default:switch(typeof d.status=="string"?d.then(Tl,Tl):(d.status="pending",d.then(function(g){d.status==="pending"&&(d.status="fulfilled",d.value=g)},function(g){d.status==="pending"&&(d.status="rejected",d.reason=g)})),d.status){case"fulfilled":return d.value;case"rejected":throw d.reason}}throw d}function z(d,g,O,N,X){var K=typeof d;(K==="undefined"||K==="boolean")&&(d=null);var ol=!1;if(d===null)ol=!0;else switch(K){case"bigint":case"string":case"number":ol=!0;break;case"object":switch(d.$$typeof){case h:case M:ol=!0;break;case G:return ol=d._init,z(ol(d._payload),g,O,N,X)}}if(ol)return X=X(d),ol=N===""?"."+_t(d,0):N,Hl(X)?(O="",ol!=null&&(O=ol.replace(Rt,"$&/")+"/"),z(X,g,O,"",function(Oa){return 
Oa})):X!=null&&(kl(X)&&(X=$l(X,O+(X.key==null||d&&d.key===X.key?"":(""+X.key).replace(Rt,"$&/")+"/")+ol)),g.push(X)),1;ol=0;var wl=N===""?".":N+":";if(Hl(d))for(var xl=0;xl<d.length;xl++)N=d[xl],K=wl+_t(N,xl),ol+=z(N,g,O,K,X);else if(xl=P(d),typeof xl=="function")for(d=xl.call(d),xl=0;!(N=d.next()).done;)N=N.value,K=wl+_t(N,xl++),ol+=z(N,g,O,K,X);else if(K==="object"){if(typeof d.then=="function")return z(ut(d),g,O,N,X);throw g=String(d),Error("Objects are not valid as a React child (found: "+(g==="[object Object]"?"object with keys {"+Object.keys(d).join(", ")+"}":g)+"). If you meant to render a collection of children, use an array instead.")}return ol}function D(d,g,O){if(d==null)return d;var N=[],X=0;return z(d,N,"","",function(K){return g.call(O,K,X++)}),N}function V(d){if(d._status===-1){var g=d._result;g=g(),g.then(function(O){(d._status===0||d._status===-1)&&(d._status=1,d._result=O)},function(O){(d._status===0||d._status===-1)&&(d._status=2,d._result=O)}),d._status===-1&&(d._status=0,d._result=g)}if(d._status===1)return d._result.default;throw d._result}var dl=typeof reportError=="function"?reportError:function(d){if(typeof window=="object"&&typeof window.ErrorEvent=="function"){var g=new window.ErrorEvent("error",{bubbles:!0,cancelable:!0,message:typeof d=="object"&&d!==null&&typeof d.message=="string"?String(d.message):String(d),error:d});if(!window.dispatchEvent(g))return}else if(typeof process=="object"&&typeof process.emit=="function"){process.emit("uncaughtException",d);return}console.error(d)},yl={map:D,forEach:function(d,g,O){D(d,function(){g.apply(this,arguments)},O)},count:function(d){var g=0;return D(d,function(){g++}),g},toArray:function(d){return D(d,function(g){return g})||[]},only:function(d){if(!kl(d))throw Error("React.Children.only expected to receive a single React element child.");return d}};return 
L.Activity=U,L.Children=yl,L.Component=zl,L.Fragment=C,L.Profiler=q,L.PureComponent=nl,L.StrictMode=o,L.Suspense=p,L.__CLIENT_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE=W,L.__COMPILER_RUNTIME={__proto__:null,c:function(d){return W.H.useMemoCache(d)}},L.cache=function(d){return function(){return d.apply(null,arguments)}},L.cacheSignal=function(){return null},L.cloneElement=function(d,g,O){if(d==null)throw Error("The argument must be a React element, but you passed "+d+".");var N=vl({},d.props),X=d.key;if(g!=null)for(K in g.key!==void 0&&(X=""+g.key),g)!Bl.call(g,K)||K==="key"||K==="__self"||K==="__source"||K==="ref"&&g.ref===void 0||(N[K]=g[K]);var K=arguments.length-2;if(K===1)N.children=O;else if(1<K){for(var ol=Array(K),wl=0;wl<K;wl++)ol[wl]=arguments[wl+2];N.children=ol}return Kl(d.type,X,N)},L.createContext=function(d){return d={$$typeof:H,_currentValue:d,_currentValue2:d,_threadCount:0,Provider:null,Consumer:null},d.Provider=d,d.Consumer={$$typeof:Y,_context:d},d},L.createElement=function(d,g,O){var N,X={},K=null;if(g!=null)for(N in g.key!==void 0&&(K=""+g.key),g)Bl.call(g,N)&&N!=="key"&&N!=="__self"&&N!=="__source"&&(X[N]=g[N]);var ol=arguments.length-2;if(ol===1)X.children=O;else if(1<ol){for(var wl=Array(ol),xl=0;xl<ol;xl++)wl[xl]=arguments[xl+2];X.children=wl}if(d&&d.defaultProps)for(N in ol=d.defaultProps,ol)X[N]===void 0&&(X[N]=ol[N]);return Kl(d,K,X)},L.createRef=function(){return{current:null}},L.forwardRef=function(d){return{$$typeof:R,render:d}},L.isValidElement=kl,L.lazy=function(d){return{$$typeof:G,_payload:{_status:-1,_result:d},_init:V}},L.memo=function(d,g){return{$$typeof:E,type:d,compare:g===void 0?null:g}},L.startTransition=function(d){var g=W.T,O={};W.T=O;try{var N=d(),X=W.S;X!==null&&X(O,N),typeof N=="object"&&N!==null&&typeof N.then=="function"&&N.then(Tl,dl)}catch(K){dl(K)}finally{g!==null&&O.types!==null&&(g.types=O.types),W.T=g}},L.unstable_useCacheRefresh=function(){return W.H.useCacheRefresh()},L.use=function(d){return 
W.H.use(d)},L.useActionState=function(d,g,O){return W.H.useActionState(d,g,O)},L.useCallback=function(d,g){return W.H.useCallback(d,g)},L.useContext=function(d){return W.H.useContext(d)},L.useDebugValue=function(){},L.useDeferredValue=function(d,g){return W.H.useDeferredValue(d,g)},L.useEffect=function(d,g){return W.H.useEffect(d,g)},L.useEffectEvent=function(d){return W.H.useEffectEvent(d)},L.useId=function(){return W.H.useId()},L.useImperativeHandle=function(d,g,O){return W.H.useImperativeHandle(d,g,O)},L.useInsertionEffect=function(d,g){return W.H.useInsertionEffect(d,g)},L.useLayoutEffect=function(d,g){return W.H.useLayoutEffect(d,g)},L.useMemo=function(d,g){return W.H.useMemo(d,g)},L.useOptimistic=function(d,g){return W.H.useOptimistic(d,g)},L.useReducer=function(d,g,O){return W.H.useReducer(d,g,O)},L.useRef=function(d){return W.H.useRef(d)},L.useState=function(d){return W.H.useState(d)},L.useSyncExternalStore=function(d,g,O){return W.H.useSyncExternalStore(d,g,O)},L.useTransition=function(){return W.H.useTransition()},L.version="19.2.4",L}var Ao;function vf(){return Ao||(Ao=1,sf.exports=ir()),sf.exports}var ml=vf(),df={exports:{}},zu={},of={exports:{}},mf={};var po;function fr(){return po||(po=1,(function(h){function M(z,D){var V=z.length;z.push(D);l:for(;0<V;){var dl=V-1>>>1,yl=z[dl];if(0<q(yl,D))z[dl]=D,z[V]=yl,V=dl;else break l}}function C(z){return z.length===0?null:z[0]}function o(z){if(z.length===0)return null;var D=z[0],V=z.pop();if(V!==D){z[0]=V;l:for(var dl=0,yl=z.length,d=yl>>>1;dl<d;){var g=2*(dl+1)-1,O=z[g],N=g+1,X=z[N];if(0>q(O,V))N<yl&&0>q(X,O)?(z[dl]=X,z[N]=V,dl=N):(z[dl]=O,z[g]=V,dl=g);else if(N<yl&&0>q(X,V))z[dl]=X,z[N]=V,dl=N;else break l}}return D}function q(z,D){var V=z.sortIndex-D.sortIndex;return V!==0?V:z.id-D.id}if(h.unstable_now=void 0,typeof performance=="object"&&typeof performance.now=="function"){var Y=performance;h.unstable_now=function(){return Y.now()}}else{var H=Date,R=H.now();h.unstable_now=function(){return H.now()-R}}var 
p=[],E=[],G=1,U=null,x=3,P=!1,k=!1,vl=!1,ul=!1,zl=typeof setTimeout=="function"?setTimeout:null,$=typeof clearTimeout=="function"?clearTimeout:null,nl=typeof setImmediate<"u"?setImmediate:null;function al(z){for(var D=C(E);D!==null;){if(D.callback===null)o(E);else if(D.startTime<=z)o(E),D.sortIndex=D.expirationTime,M(p,D);else break;D=C(E)}}function Hl(z){if(vl=!1,al(z),!k)if(C(p)!==null)k=!0,Tl||(Tl=!0,fl());else{var D=C(E);D!==null&&ut(Hl,D.startTime-z)}}var Tl=!1,W=-1,Bl=5,Kl=-1;function $l(){return ul?!0:!(h.unstable_now()-Kl<Bl)}function kl(){if(ul=!1,Tl){var z=h.unstable_now();Kl=z;var D=!0;try{l:{k=!1,vl&&(vl=!1,$(W),W=-1),P=!0;var V=x;try{t:{for(al(z),U=C(p);U!==null&&!(U.expirationTime>z&&$l());){var dl=U.callback;if(typeof dl=="function"){U.callback=null,x=U.priorityLevel;var yl=dl(U.expirationTime<=z);if(z=h.unstable_now(),typeof yl=="function"){U.callback=yl,al(z),D=!0;break t}U===C(p)&&o(p),al(z)}else o(p);U=C(p)}if(U!==null)D=!0;else{var d=C(E);d!==null&&ut(Hl,d.startTime-z),D=!1}}break l}finally{U=null,x=V,P=!1}D=void 0}}finally{D?fl():Tl=!1}}}var fl;if(typeof nl=="function")fl=function(){nl(kl)};else if(typeof MessageChannel<"u"){var Rt=new MessageChannel,_t=Rt.port2;Rt.port1.onmessage=kl,fl=function(){_t.postMessage(null)}}else fl=function(){zl(kl,0)};function ut(z,D){W=zl(function(){z(h.unstable_now())},D)}h.unstable_IdlePriority=5,h.unstable_ImmediatePriority=1,h.unstable_LowPriority=4,h.unstable_NormalPriority=3,h.unstable_Profiling=null,h.unstable_UserBlockingPriority=2,h.unstable_cancelCallback=function(z){z.callback=null},h.unstable_forceFrameRate=function(z){0>z||125<z?console.error("forceFrameRate takes a positive int between 0 and 125, forcing frame rates higher than 125 fps is not supported"):Bl=0<z?Math.floor(1e3/z):5},h.unstable_getCurrentPriorityLevel=function(){return x},h.unstable_next=function(z){switch(x){case 1:case 2:case 3:var D=3;break;default:D=x}var V=x;x=D;try{return 
z()}finally{x=V}},h.unstable_requestPaint=function(){ul=!0},h.unstable_runWithPriority=function(z,D){switch(z){case 1:case 2:case 3:case 4:case 5:break;default:z=3}var V=x;x=z;try{return D()}finally{x=V}},h.unstable_scheduleCallback=function(z,D,V){var dl=h.unstable_now();switch(typeof V=="object"&&V!==null?(V=V.delay,V=typeof V=="number"&&0<V?dl+V:dl):V=dl,z){case 1:var yl=-1;break;case 2:yl=250;break;case 5:yl=1073741823;break;case 4:yl=1e4;break;default:yl=5e3}return yl=V+yl,z={id:G++,callback:D,priorityLevel:z,startTime:V,expirationTime:yl,sortIndex:-1},V>dl?(z.sortIndex=V,M(E,z),C(p)===null&&z===C(E)&&(vl?($(W),W=-1):vl=!0,ut(Hl,V-dl))):(z.sortIndex=yl,M(p,z),k||P||(k=!0,Tl||(Tl=!0,fl()))),z},h.unstable_shouldYield=$l,h.unstable_wrapCallback=function(z){var D=x;return function(){var V=x;x=D;try{return z.apply(this,arguments)}finally{x=V}}}})(mf)),mf}var _o;function sr(){return _o||(_o=1,of.exports=fr()),of.exports}var yf={exports:{}},Jl={};var Oo;function dr(){if(Oo)return Jl;Oo=1;var h=vf();function M(p){var E="https://react.dev/errors/"+p;if(1<arguments.length){E+="?args[]="+encodeURIComponent(arguments[1]);for(var G=2;G<arguments.length;G++)E+="&args[]="+encodeURIComponent(arguments[G])}return"Minified React error #"+p+"; visit "+E+" for the full message or use the non-minified dev environment for full errors and additional helpful warnings."}function C(){}var o={d:{f:C,r:function(){throw Error(M(522))},D:C,C,L:C,m:C,X:C,S:C,M:C},p:0,findDOMNode:null},q=Symbol.for("react.portal");function Y(p,E,G){var U=3<arguments.length&&arguments[3]!==void 0?arguments[3]:null;return{$$typeof:q,key:U==null?null:""+U,children:p,containerInfo:E,implementation:G}}var H=h.__CLIENT_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE;function R(p,E){if(p==="font")return"";if(typeof E=="string")return E==="use-credentials"?E:""}return Jl.__DOM_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE=o,Jl.createPortal=function(p,E){var G=2<arguments.length&&arguments[2]!==void 
0?arguments[2]:null;if(!E||E.nodeType!==1&&E.nodeType!==9&&E.nodeType!==11)throw Error(M(299));return Y(p,E,null,G)},Jl.flushSync=function(p){var E=H.T,G=o.p;try{if(H.T=null,o.p=2,p)return p()}finally{H.T=E,o.p=G,o.d.f()}},Jl.preconnect=function(p,E){typeof p=="string"&&(E?(E=E.crossOrigin,E=typeof E=="string"?E==="use-credentials"?E:"":void 0):E=null,o.d.C(p,E))},Jl.prefetchDNS=function(p){typeof p=="string"&&o.d.D(p)},Jl.preinit=function(p,E){if(typeof p=="string"&&E&&typeof E.as=="string"){var G=E.as,U=R(G,E.crossOrigin),x=typeof E.integrity=="string"?E.integrity:void 0,P=typeof E.fetchPriority=="string"?E.fetchPriority:void 0;G==="style"?o.d.S(p,typeof E.precedence=="string"?E.precedence:void 0,{crossOrigin:U,integrity:x,fetchPriority:P}):G==="script"&&o.d.X(p,{crossOrigin:U,integrity:x,fetchPriority:P,nonce:typeof E.nonce=="string"?E.nonce:void 0})}},Jl.preinitModule=function(p,E){if(typeof p=="string")if(typeof E=="object"&&E!==null){if(E.as==null||E.as==="script"){var G=R(E.as,E.crossOrigin);o.d.M(p,{crossOrigin:G,integrity:typeof E.integrity=="string"?E.integrity:void 0,nonce:typeof E.nonce=="string"?E.nonce:void 0})}}else E==null&&o.d.M(p)},Jl.preload=function(p,E){if(typeof p=="string"&&typeof E=="object"&&E!==null&&typeof E.as=="string"){var G=E.as,U=R(G,E.crossOrigin);o.d.L(p,G,{crossOrigin:U,integrity:typeof E.integrity=="string"?E.integrity:void 0,nonce:typeof E.nonce=="string"?E.nonce:void 0,type:typeof E.type=="string"?E.type:void 0,fetchPriority:typeof E.fetchPriority=="string"?E.fetchPriority:void 0,referrerPolicy:typeof E.referrerPolicy=="string"?E.referrerPolicy:void 0,imageSrcSet:typeof E.imageSrcSet=="string"?E.imageSrcSet:void 0,imageSizes:typeof E.imageSizes=="string"?E.imageSizes:void 0,media:typeof E.media=="string"?E.media:void 0})}},Jl.preloadModule=function(p,E){if(typeof p=="string")if(E){var G=R(E.as,E.crossOrigin);o.d.m(p,{as:typeof E.as=="string"&&E.as!=="script"?E.as:void 0,crossOrigin:G,integrity:typeof 
E.integrity=="string"?E.integrity:void 0})}else o.d.m(p)},Jl.requestFormReset=function(p){o.d.r(p)},Jl.unstable_batchedUpdates=function(p,E){return p(E)},Jl.useFormState=function(p,E,G){return H.H.useFormState(p,E,G)},Jl.useFormStatus=function(){return H.H.useHostTransitionStatus()},Jl.version="19.2.4",Jl}var Mo;function or(){if(Mo)return yf.exports;Mo=1;function h(){if(!(typeof __REACT_DEVTOOLS_GLOBAL_HOOK__>"u"||typeof __REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE!="function"))try{__REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE(h)}catch(M){console.error(M)}}return h(),yf.exports=dr(),yf.exports}var xo;function mr(){if(xo)return zu;xo=1;var h=sr(),M=vf(),C=or();function o(l){var t="https://react.dev/errors/"+l;if(1<arguments.length){t+="?args[]="+encodeURIComponent(arguments[1]);for(var e=2;e<arguments.length;e++)t+="&args[]="+encodeURIComponent(arguments[e])}return"Minified React error #"+l+"; visit "+t+" for the full message or use the non-minified dev environment for full errors and additional helpful warnings."}function q(l){return!(!l||l.nodeType!==1&&l.nodeType!==9&&l.nodeType!==11)}function Y(l){var t=l,e=l;if(l.alternate)for(;t.return;)t=t.return;else{l=t;do t=l,(t.flags&4098)!==0&&(e=t.return),l=t.return;while(l)}return t.tag===3?e:null}function H(l){if(l.tag===13){var t=l.memoizedState;if(t===null&&(l=l.alternate,l!==null&&(t=l.memoizedState)),t!==null)return t.dehydrated}return null}function R(l){if(l.tag===31){var t=l.memoizedState;if(t===null&&(l=l.alternate,l!==null&&(t=l.memoizedState)),t!==null)return t.dehydrated}return null}function p(l){if(Y(l)!==l)throw Error(o(188))}function E(l){var t=l.alternate;if(!t){if(t=Y(l),t===null)throw Error(o(188));return t!==l?null:l}for(var e=l,a=t;;){var u=e.return;if(u===null)break;var n=u.alternate;if(n===null){if(a=u.return,a!==null){e=a;continue}break}if(u.child===n.child){for(n=u.child;n;){if(n===e)return p(u),l;if(n===a)return p(u),t;n=n.sibling}throw Error(o(188))}if(e.return!==a.return)e=u,a=n;else{for(var 
c=!1,i=u.child;i;){if(i===e){c=!0,e=u,a=n;break}if(i===a){c=!0,a=u,e=n;break}i=i.sibling}if(!c){for(i=n.child;i;){if(i===e){c=!0,e=n,a=u;break}if(i===a){c=!0,a=n,e=u;break}i=i.sibling}if(!c)throw Error(o(189))}}if(e.alternate!==a)throw Error(o(190))}if(e.tag!==3)throw Error(o(188));return e.stateNode.current===e?l:t}function G(l){var t=l.tag;if(t===5||t===26||t===27||t===6)return l;for(l=l.child;l!==null;){if(t=G(l),t!==null)return t;l=l.sibling}return null}var U=Object.assign,x=Symbol.for("react.element"),P=Symbol.for("react.transitional.element"),k=Symbol.for("react.portal"),vl=Symbol.for("react.fragment"),ul=Symbol.for("react.strict_mode"),zl=Symbol.for("react.profiler"),$=Symbol.for("react.consumer"),nl=Symbol.for("react.context"),al=Symbol.for("react.forward_ref"),Hl=Symbol.for("react.suspense"),Tl=Symbol.for("react.suspense_list"),W=Symbol.for("react.memo"),Bl=Symbol.for("react.lazy"),Kl=Symbol.for("react.activity"),$l=Symbol.for("react.memo_cache_sentinel"),kl=Symbol.iterator;function fl(l){return l===null||typeof l!="object"?null:(l=kl&&l[kl]||l["@@iterator"],typeof l=="function"?l:null)}var Rt=Symbol.for("react.client.reference");function _t(l){if(l==null)return null;if(typeof l=="function")return l.$$typeof===Rt?null:l.displayName||l.name||null;if(typeof l=="string")return l;switch(l){case vl:return"Fragment";case zl:return"Profiler";case ul:return"StrictMode";case Hl:return"Suspense";case Tl:return"SuspenseList";case Kl:return"Activity"}if(typeof l=="object")switch(l.$$typeof){case k:return"Portal";case nl:return l.displayName||"Context";case $:return(l._context.displayName||"Context")+".Consumer";case al:var t=l.render;return l=l.displayName,l||(l=t.displayName||t.name||"",l=l!==""?"ForwardRef("+l+")":"ForwardRef"),l;case W:return t=l.displayName||null,t!==null?t:_t(l.type)||"Memo";case Bl:t=l._payload,l=l._init;try{return _t(l(t))}catch{}}return null}var 
ut=Array.isArray,z=M.__CLIENT_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE,D=C.__DOM_INTERNALS_DO_NOT_USE_OR_WARN_USERS_THEY_CANNOT_UPGRADE,V={pending:!1,data:null,method:null,action:null},dl=[],yl=-1;function d(l){return{current:l}}function g(l){0>yl||(l.current=dl[yl],dl[yl]=null,yl--)}function O(l,t){yl++,dl[yl]=l.current,l.current=t}var N=d(null),X=d(null),K=d(null),ol=d(null);function wl(l,t){switch(O(K,t),O(X,l),O(N,null),t.nodeType){case 9:case 11:l=(l=t.documentElement)&&(l=l.namespaceURI)?Zd(l):0;break;default:if(l=t.tagName,t=t.namespaceURI)t=Zd(t),l=Vd(t,l);else switch(l){case"svg":l=1;break;case"math":l=2;break;default:l=0}}g(N),O(N,l)}function xl(){g(N),g(X),g(K)}function Oa(l){l.memoizedState!==null&&O(ol,l);var t=N.current,e=Vd(t,l.type);t!==e&&(O(X,l),O(N,e))}function Tu(l){X.current===l&&(g(N),g(X)),ol.current===l&&(g(ol),hu._currentValue=V)}var Vn,Sf;function Ae(l){if(Vn===void 0)try{throw Error()}catch(e){var t=e.stack.trim().match(/\n( *(at )?)/);Vn=t&&t[1]||"",Sf=-1<e.stack.indexOf(`
     at`)?" (<anonymous>)":-1<e.stack.indexOf("@")?"@unknown:0:0":""}return`
-`+Zn+l+gf}var Vn=!1;function Ln(l,t){if(!l||Vn)return"";Vn=!0;var e=Error.prepareStackTrace;Error.prepareStackTrace=void 0;try{var a={DetermineComponentFrameRoot:function(){try{if(t){var A=function(){throw Error()};if(Object.defineProperty(A.prototype,"props",{set:function(){throw Error()}}),typeof Reflect=="object"&&Reflect.construct){try{Reflect.construct(A,[])}catch(S){var h=S}Reflect.construct(l,[],A)}else{try{A.call()}catch(S){h=S}l.call(A.prototype)}}else{try{throw Error()}catch(S){h=S}(A=l())&&typeof A.catch=="function"&&A.catch(function(){})}}catch(S){if(S&&h&&typeof S.stack=="string")return[S.stack,h.stack]}return[null,null]}};a.DetermineComponentFrameRoot.displayName="DetermineComponentFrameRoot";var u=Object.getOwnPropertyDescriptor(a.DetermineComponentFrameRoot,"name");u&&u.configurable&&Object.defineProperty(a.DetermineComponentFrameRoot,"name",{value:"DetermineComponentFrameRoot"});var n=a.DetermineComponentFrameRoot(),c=n[0],i=n[1];if(c&&i){var f=c.split(`
+`+Vn+l+Sf}var Ln=!1;function Kn(l,t){if(!l||Ln)return"";Ln=!0;var e=Error.prepareStackTrace;Error.prepareStackTrace=void 0;try{var a={DetermineComponentFrameRoot:function(){try{if(t){var A=function(){throw Error()};if(Object.defineProperty(A.prototype,"props",{set:function(){throw Error()}}),typeof Reflect=="object"&&Reflect.construct){try{Reflect.construct(A,[])}catch(S){var v=S}Reflect.construct(l,[],A)}else{try{A.call()}catch(S){v=S}l.call(A.prototype)}}else{try{throw Error()}catch(S){v=S}(A=l())&&typeof A.catch=="function"&&A.catch(function(){})}}catch(S){if(S&&v&&typeof S.stack=="string")return[S.stack,v.stack]}return[null,null]}};a.DetermineComponentFrameRoot.displayName="DetermineComponentFrameRoot";var u=Object.getOwnPropertyDescriptor(a.DetermineComponentFrameRoot,"name");u&&u.configurable&&Object.defineProperty(a.DetermineComponentFrameRoot,"name",{value:"DetermineComponentFrameRoot"});var n=a.DetermineComponentFrameRoot(),c=n[0],i=n[1];if(c&&i){var f=c.split(`
 `),r=i.split(`
 `);for(u=a=0;a<f.length&&!f[a].includes("DetermineComponentFrameRoot");)a++;for(;u<r.length&&!r[u].includes("DetermineComponentFrameRoot");)u++;if(a===f.length||u===r.length)for(a=f.length-1,u=r.length-1;1<=a&&0<=u&&f[a]!==r[u];)u--;for(;1<=a&&0<=u;a--,u--)if(f[a]!==r[u]){if(a!==1||u!==1)do if(a--,u--,0>u||f[a]!==r[u]){var b=`
-`+f[a].replace(" at new "," at ");return l.displayName&&b.includes("<anonymous>")&&(b=b.replace("<anonymous>",l.displayName)),b}while(1<=a&&0<=u);break}}}finally{Vn=!1,Error.prepareStackTrace=e}return(e=l?l.displayName||l.name:"")?Ae(e):""}function jo(l,t){switch(l.tag){case 26:case 27:case 5:return Ae(l.type);case 16:return Ae("Lazy");case 13:return l.child!==t&&t!==null?Ae("Suspense Fallback"):Ae("Suspense");case 19:return Ae("SuspenseList");case 0:case 15:return Ln(l.type,!1);case 11:return Ln(l.type.render,!1);case 1:return Ln(l.type,!0);case 31:return Ae("Activity");default:return""}}function Sf(l){try{var t="",e=null;do t+=jo(l,e),e=l,l=l.return;while(l);return t}catch(a){return`
+`+f[a].replace(" at new "," at ");return l.displayName&&b.includes("<anonymous>")&&(b=b.replace("<anonymous>",l.displayName)),b}while(1<=a&&0<=u);break}}}finally{Ln=!1,Error.prepareStackTrace=e}return(e=l?l.displayName||l.name:"")?Ae(e):""}function Bo(l,t){switch(l.tag){case 26:case 27:case 5:return Ae(l.type);case 16:return Ae("Lazy");case 13:return l.child!==t&&t!==null?Ae("Suspense Fallback"):Ae("Suspense");case 19:return Ae("SuspenseList");case 0:case 15:return Kn(l.type,!1);case 11:return Kn(l.type.render,!1);case 1:return Kn(l.type,!0);case 31:return Ae("Activity");default:return""}}function bf(l){try{var t="",e=null;do t+=Bo(l,e),e=l,l=l.return;while(l);return t}catch(a){return`
 Error generating stack: `+a.message+`
-`+a.stack}}var Kn=Object.prototype.hasOwnProperty,Jn=g.unstable_scheduleCallback,wn=g.unstable_cancelCallback,Ho=g.unstable_shouldYield,Bo=g.unstable_requestPaint,nt=g.unstable_now,qo=g.unstable_getCurrentPriorityLevel,bf=g.unstable_ImmediatePriority,zf=g.unstable_UserBlockingPriority,Au=g.unstable_NormalPriority,Yo=g.unstable_LowPriority,Ef=g.unstable_IdlePriority,Go=g.log,Qo=g.unstable_setDisableYieldValue,Ma=null,ct=null;function It(l){if(typeof Go=="function"&&Qo(l),ct&&typeof ct.setStrictMode=="function")try{ct.setStrictMode(Ma,l)}catch{}}var it=Math.clz32?Math.clz32:Vo,Xo=Math.log,Zo=Math.LN2;function Vo(l){return l>>>=0,l===0?32:31-(Xo(l)/Zo|0)|0}var pu=256,_u=262144,Ou=4194304;function pe(l){var t=l&42;if(t!==0)return t;switch(l&-l){case 1:return 1;case 2:return 2;case 4:return 4;case 8:return 8;case 16:return 16;case 32:return 32;case 64:return 64;case 128:return 128;case 256:case 512:case 1024:case 2048:case 4096:case 8192:case 16384:case 32768:case 65536:case 131072:return l&261888;case 262144:case 524288:case 1048576:case 2097152:return l&3932160;case 4194304:case 8388608:case 16777216:case 33554432:return l&62914560;case 67108864:return 67108864;case 134217728:return 134217728;case 268435456:return 268435456;case 536870912:return 536870912;case 1073741824:return 0;default:return l}}function Mu(l,t,e){var a=l.pendingLanes;if(a===0)return 0;var u=0,n=l.suspendedLanes,c=l.pingedLanes;l=l.warmLanes;var i=a&134217727;return i!==0?(a=i&~n,a!==0?u=pe(a):(c&=i,c!==0?u=pe(c):e||(e=i&~l,e!==0&&(u=pe(e))))):(i=a&~n,i!==0?u=pe(i):c!==0?u=pe(c):e||(e=a&~l,e!==0&&(u=pe(e)))),u===0?0:t!==0&&t!==u&&(t&n)===0&&(n=u&-u,e=t&-t,n>=e||n===32&&(e&4194048)!==0)?t:u}function xa(l,t){return(l.pendingLanes&~(l.suspendedLanes&~l.pingedLanes)&t)===0}function Lo(l,t){switch(l){case 1:case 2:case 4:case 8:case 64:return t+250;case 16:case 32:case 128:case 256:case 512:case 1024:case 2048:case 4096:case 8192:case 16384:case 32768:case 65536:case 131072:case 262144:case 524288:case 
1048576:case 2097152:return t+5e3;case 4194304:case 8388608:case 16777216:case 33554432:return-1;case 67108864:case 134217728:case 268435456:case 536870912:case 1073741824:return-1;default:return-1}}function Tf(){var l=Ou;return Ou<<=1,(Ou&62914560)===0&&(Ou=4194304),l}function Wn(l){for(var t=[],e=0;31>e;e++)t.push(l);return t}function Na(l,t){l.pendingLanes|=t,t!==268435456&&(l.suspendedLanes=0,l.pingedLanes=0,l.warmLanes=0)}function Ko(l,t,e,a,u,n){var c=l.pendingLanes;l.pendingLanes=e,l.suspendedLanes=0,l.pingedLanes=0,l.warmLanes=0,l.expiredLanes&=e,l.entangledLanes&=e,l.errorRecoveryDisabledLanes&=e,l.shellSuspendCounter=0;var i=l.entanglements,f=l.expirationTimes,r=l.hiddenUpdates;for(e=c&~e;0<e;){var b=31-it(e),A=1<<b;i[b]=0,f[b]=-1;var h=r[b];if(h!==null)for(r[b]=null,b=0;b<h.length;b++){var S=h[b];S!==null&&(S.lane&=-536870913)}e&=~A}a!==0&&Af(l,a,0),n!==0&&u===0&&l.tag!==0&&(l.suspendedLanes|=n&~(c&~t))}function Af(l,t,e){l.pendingLanes|=t,l.suspendedLanes&=~t;var a=31-it(t);l.entangledLanes|=t,l.entanglements[a]=l.entanglements[a]|1073741824|e&261930}function pf(l,t){var e=l.entangledLanes|=t;for(l=l.entanglements;e;){var a=31-it(e),u=1<<a;u&t|l[a]&t&&(l[a]|=t),e&=~u}}function _f(l,t){var e=t&-t;return e=(e&42)!==0?1:$n(e),(e&(l.suspendedLanes|t))!==0?0:e}function $n(l){switch(l){case 2:l=1;break;case 8:l=4;break;case 32:l=16;break;case 256:case 512:case 1024:case 2048:case 4096:case 8192:case 16384:case 32768:case 65536:case 131072:case 262144:case 524288:case 1048576:case 2097152:case 4194304:case 8388608:case 16777216:case 33554432:l=128;break;case 268435456:l=134217728;break;default:l=0}return l}function kn(l){return l&=-l,2<l?8<l?(l&134217727)!==0?32:268435456:8:2}function Of(){var l=N.p;return l!==0?l:(l=window.event,l===void 0?32:mo(l.type))}function Mf(l,t){var e=N.p;try{return N.p=l,t()}finally{N.p=e}}var 
Pt=Math.random().toString(36).slice(2),Ql="__reactFiber$"+Pt,Fl="__reactProps$"+Pt,Ze="__reactContainer$"+Pt,Fn="__reactEvents$"+Pt,Jo="__reactListeners$"+Pt,wo="__reactHandles$"+Pt,xf="__reactResources$"+Pt,Da="__reactMarker$"+Pt;function In(l){delete l[Ql],delete l[Fl],delete l[Fn],delete l[Jo],delete l[wo]}function Ve(l){var t=l[Ql];if(t)return t;for(var e=l.parentNode;e;){if(t=e[Ze]||e[Ql]){if(e=t.alternate,t.child!==null||e!==null&&e.child!==null)for(l=$d(l);l!==null;){if(e=l[Ql])return e;l=$d(l)}return t}l=e,e=l.parentNode}return null}function Le(l){if(l=l[Ql]||l[Ze]){var t=l.tag;if(t===5||t===6||t===13||t===31||t===26||t===27||t===3)return l}return null}function Ua(l){var t=l.tag;if(t===5||t===26||t===27||t===6)return l.stateNode;throw Error(o(33))}function Ke(l){var t=l[xf];return t||(t=l[xf]={hoistableStyles:new Map,hoistableScripts:new Map}),t}function ql(l){l[Da]=!0}var Nf=new Set,Df={};function _e(l,t){Je(l,t),Je(l+"Capture",t)}function Je(l,t){for(Df[l]=t,l=0;l<t.length;l++)Nf.add(t[l])}var Wo=RegExp("^[:A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD][:A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD\\-.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040]*$"),Uf={},Cf={};function $o(l){return Kn.call(Cf,l)?!0:Kn.call(Uf,l)?!1:Wo.test(l)?Cf[l]=!0:(Uf[l]=!0,!1)}function xu(l,t,e){if($o(t))if(e===null)l.removeAttribute(t);else{switch(typeof e){case"undefined":case"function":case"symbol":l.removeAttribute(t);return;case"boolean":var a=t.toLowerCase().slice(0,5);if(a!=="data-"&&a!=="aria-"){l.removeAttribute(t);return}}l.setAttribute(t,""+e)}}function Nu(l,t,e){if(e===null)l.removeAttribute(t);else{switch(typeof 
e){case"undefined":case"function":case"symbol":case"boolean":l.removeAttribute(t);return}l.setAttribute(t,""+e)}}function jt(l,t,e,a){if(a===null)l.removeAttribute(e);else{switch(typeof a){case"undefined":case"function":case"symbol":case"boolean":l.removeAttribute(e);return}l.setAttributeNS(t,e,""+a)}}function ht(l){switch(typeof l){case"bigint":case"boolean":case"number":case"string":case"undefined":return l;case"object":return l;default:return""}}function Rf(l){var t=l.type;return(l=l.nodeName)&&l.toLowerCase()==="input"&&(t==="checkbox"||t==="radio")}function ko(l,t,e){var a=Object.getOwnPropertyDescriptor(l.constructor.prototype,t);if(!l.hasOwnProperty(t)&&typeof a<"u"&&typeof a.get=="function"&&typeof a.set=="function"){var u=a.get,n=a.set;return Object.defineProperty(l,t,{configurable:!0,get:function(){return u.call(this)},set:function(c){e=""+c,n.call(this,c)}}),Object.defineProperty(l,t,{enumerable:a.enumerable}),{getValue:function(){return e},setValue:function(c){e=""+c},stopTracking:function(){l._valueTracker=null,delete l[t]}}}}function Pn(l){if(!l._valueTracker){var t=Rf(l)?"checked":"value";l._valueTracker=ko(l,t,""+l[t])}}function jf(l){if(!l)return!1;var t=l._valueTracker;if(!t)return!0;var e=t.getValue(),a="";return l&&(a=Rf(l)?l.checked?"true":"false":l.value),l=a,l!==e?(t.setValue(l),!0):!1}function Du(l){if(l=l||(typeof document<"u"?document:void 0),typeof l>"u")return null;try{return l.activeElement||l.body}catch{return l.body}}var Fo=/[\n"\\]/g;function vt(l){return l.replace(Fo,function(t){return"\\"+t.charCodeAt(0).toString(16)+" "})}function lc(l,t,e,a,u,n,c,i){l.name="",c!=null&&typeof c!="function"&&typeof c!="symbol"&&typeof 
c!="boolean"?l.type=c:l.removeAttribute("type"),t!=null?c==="number"?(t===0&&l.value===""||l.value!=t)&&(l.value=""+ht(t)):l.value!==""+ht(t)&&(l.value=""+ht(t)):c!=="submit"&&c!=="reset"||l.removeAttribute("value"),t!=null?tc(l,c,ht(t)):e!=null?tc(l,c,ht(e)):a!=null&&l.removeAttribute("value"),u==null&&n!=null&&(l.defaultChecked=!!n),u!=null&&(l.checked=u&&typeof u!="function"&&typeof u!="symbol"),i!=null&&typeof i!="function"&&typeof i!="symbol"&&typeof i!="boolean"?l.name=""+ht(i):l.removeAttribute("name")}function Hf(l,t,e,a,u,n,c,i){if(n!=null&&typeof n!="function"&&typeof n!="symbol"&&typeof n!="boolean"&&(l.type=n),t!=null||e!=null){if(!(n!=="submit"&&n!=="reset"||t!=null)){Pn(l);return}e=e!=null?""+ht(e):"",t=t!=null?""+ht(t):e,i||t===l.value||(l.value=t),l.defaultValue=t}a=a??u,a=typeof a!="function"&&typeof a!="symbol"&&!!a,l.checked=i?l.checked:!!a,l.defaultChecked=!!a,c!=null&&typeof c!="function"&&typeof c!="symbol"&&typeof c!="boolean"&&(l.name=c),Pn(l)}function tc(l,t,e){t==="number"&&Du(l.ownerDocument)===l||l.defaultValue===""+e||(l.defaultValue=""+e)}function we(l,t,e,a){if(l=l.options,t){t={};for(var u=0;u<e.length;u++)t["$"+e[u]]=!0;for(e=0;e<l.length;e++)u=t.hasOwnProperty("$"+l[e].value),l[e].selected!==u&&(l[e].selected=u),u&&a&&(l[e].defaultSelected=!0)}else{for(e=""+ht(e),t=null,u=0;u<l.length;u++){if(l[u].value===e){l[u].selected=!0,a&&(l[u].defaultSelected=!0);return}t!==null||l[u].disabled||(t=l[u])}t!==null&&(t.selected=!0)}}function Bf(l,t,e){if(t!=null&&(t=""+ht(t),t!==l.value&&(l.value=t),e==null)){l.defaultValue!==t&&(l.defaultValue=t);return}l.defaultValue=e!=null?""+ht(e):""}function qf(l,t,e,a){if(t==null){if(a!=null){if(e!=null)throw Error(o(92));if(ut(a)){if(1<a.length)throw Error(o(93));a=a[0]}e=a}e==null&&(e=""),t=e}e=ht(t),l.defaultValue=e,a=l.textContent,a===e&&a!==""&&a!==null&&(l.value=a),Pn(l)}function We(l,t){if(t){var e=l.firstChild;if(e&&e===l.lastChild&&e.nodeType===3){e.nodeValue=t;return}}l.textContent=t}var Io=new 
Set("animationIterationCount aspectRatio borderImageOutset borderImageSlice borderImageWidth boxFlex boxFlexGroup boxOrdinalGroup columnCount columns flex flexGrow flexPositive flexShrink flexNegative flexOrder gridArea gridRow gridRowEnd gridRowSpan gridRowStart gridColumn gridColumnEnd gridColumnSpan gridColumnStart fontWeight lineClamp lineHeight opacity order orphans scale tabSize widows zIndex zoom fillOpacity floodOpacity stopOpacity strokeDasharray strokeDashoffset strokeMiterlimit strokeOpacity strokeWidth MozAnimationIterationCount MozBoxFlex MozBoxFlexGroup MozLineClamp msAnimationIterationCount msFlex msZoom msFlexGrow msFlexNegative msFlexOrder msFlexPositive msFlexShrink msGridColumn msGridColumnSpan msGridRow msGridRowSpan WebkitAnimationIterationCount WebkitBoxFlex WebKitBoxFlexGroup WebkitBoxOrdinalGroup WebkitColumnCount WebkitColumns WebkitFlex WebkitFlexGrow WebkitFlexPositive WebkitFlexShrink WebkitLineClamp".split(" "));function Yf(l,t,e){var a=t.indexOf("--")===0;e==null||typeof e=="boolean"||e===""?a?l.setProperty(t,""):t==="float"?l.cssFloat="":l[t]="":a?l.setProperty(t,e):typeof e!="number"||e===0||Io.has(t)?t==="float"?l.cssFloat=e:l[t]=(""+e).trim():l[t]=e+"px"}function Gf(l,t,e){if(t!=null&&typeof t!="object")throw Error(o(62));if(l=l.style,e!=null){for(var a in e)!e.hasOwnProperty(a)||t!=null&&t.hasOwnProperty(a)||(a.indexOf("--")===0?l.setProperty(a,""):a==="float"?l.cssFloat="":l[a]="");for(var u in t)a=t[u],t.hasOwnProperty(u)&&e[u]!==a&&Yf(l,u,a)}else for(var n in t)t.hasOwnProperty(n)&&Yf(l,n,t[n])}function ec(l){if(l.indexOf("-")===-1)return!1;switch(l){case"annotation-xml":case"color-profile":case"font-face":case"font-face-src":case"font-face-uri":case"font-face-format":case"font-face-name":case"missing-glyph":return!1;default:return!0}}var Po=new 
Map([["acceptCharset","accept-charset"],["htmlFor","for"],["httpEquiv","http-equiv"],["crossOrigin","crossorigin"],["accentHeight","accent-height"],["alignmentBaseline","alignment-baseline"],["arabicForm","arabic-form"],["baselineShift","baseline-shift"],["capHeight","cap-height"],["clipPath","clip-path"],["clipRule","clip-rule"],["colorInterpolation","color-interpolation"],["colorInterpolationFilters","color-interpolation-filters"],["colorProfile","color-profile"],["colorRendering","color-rendering"],["dominantBaseline","dominant-baseline"],["enableBackground","enable-background"],["fillOpacity","fill-opacity"],["fillRule","fill-rule"],["floodColor","flood-color"],["floodOpacity","flood-opacity"],["fontFamily","font-family"],["fontSize","font-size"],["fontSizeAdjust","font-size-adjust"],["fontStretch","font-stretch"],["fontStyle","font-style"],["fontVariant","font-variant"],["fontWeight","font-weight"],["glyphName","glyph-name"],["glyphOrientationHorizontal","glyph-orientation-horizontal"],["glyphOrientationVertical","glyph-orientation-vertical"],["horizAdvX","horiz-adv-x"],["horizOriginX","horiz-origin-x"],["imageRendering","image-rendering"],["letterSpacing","letter-spacing"],["lightingColor","lighting-color"],["markerEnd","marker-end"],["markerMid","marker-mid"],["markerStart","marker-start"],["overlinePosition","overline-position"],["overlineThickness","overline-thickness"],["paintOrder","paint-order"],["panose-1","panose-1"],["pointerEvents","pointer-events"],["renderingIntent","rendering-intent"],["shapeRendering","shape-rendering"],["stopColor","stop-color"],["stopOpacity","stop-opacity"],["strikethroughPosition","strikethrough-position"],["strikethroughThickness","strikethrough-thickness"],["strokeDasharray","stroke-dasharray"],["strokeDashoffset","stroke-dashoffset"],["strokeLinecap","stroke-linecap"],["strokeLinejoin","stroke-linejoin"],["strokeMiterlimit","stroke-miterlimit"],["strokeOpacity","stroke-opacity"],["strokeWidth","stroke-width"],["textAnchor"
,"text-anchor"],["textDecoration","text-decoration"],["textRendering","text-rendering"],["transformOrigin","transform-origin"],["underlinePosition","underline-position"],["underlineThickness","underline-thickness"],["unicodeBidi","unicode-bidi"],["unicodeRange","unicode-range"],["unitsPerEm","units-per-em"],["vAlphabetic","v-alphabetic"],["vHanging","v-hanging"],["vIdeographic","v-ideographic"],["vMathematical","v-mathematical"],["vectorEffect","vector-effect"],["vertAdvY","vert-adv-y"],["vertOriginX","vert-origin-x"],["vertOriginY","vert-origin-y"],["wordSpacing","word-spacing"],["writingMode","writing-mode"],["xmlnsXlink","xmlns:xlink"],["xHeight","x-height"]]),lm=/^[\u0000-\u001F ]*j[\r\n\t]*a[\r\n\t]*v[\r\n\t]*a[\r\n\t]*s[\r\n\t]*c[\r\n\t]*r[\r\n\t]*i[\r\n\t]*p[\r\n\t]*t[\r\n\t]*:/i;function Uu(l){return lm.test(""+l)?"javascript:throw new Error('React has blocked a javascript: URL as a security precaution.')":l}function Ht(){}var ac=null;function uc(l){return l=l.target||l.srcElement||window,l.correspondingUseElement&&(l=l.correspondingUseElement),l.nodeType===3?l.parentNode:l}var $e=null,ke=null;function Qf(l){var t=Le(l);if(t&&(l=t.stateNode)){var e=l[Fl]||null;l:switch(l=t.stateNode,t.type){case"input":if(lc(l,e.value,e.defaultValue,e.defaultValue,e.checked,e.defaultChecked,e.type,e.name),t=e.name,e.type==="radio"&&t!=null){for(e=l;e.parentNode;)e=e.parentNode;for(e=e.querySelectorAll('input[name="'+vt(""+t)+'"][type="radio"]'),t=0;t<e.length;t++){var a=e[t];if(a!==l&&a.form===l.form){var u=a[Fl]||null;if(!u)throw Error(o(90));lc(a,u.value,u.defaultValue,u.defaultValue,u.checked,u.defaultChecked,u.type,u.name)}}for(t=0;t<e.length;t++)a=e[t],a.form===l.form&&jf(a)}break l;case"textarea":Bf(l,e.value,e.defaultValue);break l;case"select":t=e.value,t!=null&&we(l,!!e.multiple,t,!1)}}}var nc=!1;function Xf(l,t,e){if(nc)return l(t,e);nc=!0;try{var a=l(t);return 
a}finally{if(nc=!1,($e!==null||ke!==null)&&(bn(),$e&&(t=$e,l=ke,ke=$e=null,Qf(t),l)))for(t=0;t<l.length;t++)Qf(l[t])}}function Ca(l,t){var e=l.stateNode;if(e===null)return null;var a=e[Fl]||null;if(a===null)return null;e=a[t];l:switch(t){case"onClick":case"onClickCapture":case"onDoubleClick":case"onDoubleClickCapture":case"onMouseDown":case"onMouseDownCapture":case"onMouseMove":case"onMouseMoveCapture":case"onMouseUp":case"onMouseUpCapture":case"onMouseEnter":(a=!a.disabled)||(l=l.type,a=!(l==="button"||l==="input"||l==="select"||l==="textarea")),l=!a;break l;default:l=!1}if(l)return null;if(e&&typeof e!="function")throw Error(o(231,t,typeof e));return e}var Bt=!(typeof window>"u"||typeof window.document>"u"||typeof window.document.createElement>"u"),cc=!1;if(Bt)try{var Ra={};Object.defineProperty(Ra,"passive",{get:function(){cc=!0}}),window.addEventListener("test",Ra,Ra),window.removeEventListener("test",Ra,Ra)}catch{cc=!1}var le=null,ic=null,Cu=null;function Zf(){if(Cu)return Cu;var l,t=ic,e=t.length,a,u="value"in le?le.value:le.textContent,n=u.length;for(l=0;l<e&&t[l]===u[l];l++);var c=e-l;for(a=1;a<=c&&t[e-a]===u[n-a];a++);return Cu=u.slice(l,1<a?1-a:void 0)}function Ru(l){var t=l.keyCode;return"charCode"in l?(l=l.charCode,l===0&&t===13&&(l=13)):l=t,l===10&&(l=13),32<=l||l===13?l:0}function ju(){return!0}function Vf(){return!1}function Il(l){function t(e,a,u,n,c){this._reactName=e,this._targetInst=u,this.type=a,this.nativeEvent=n,this.target=c,this.currentTarget=null;for(var i in l)l.hasOwnProperty(i)&&(e=l[i],this[i]=e?e(n):n[i]);return this.isDefaultPrevented=(n.defaultPrevented!=null?n.defaultPrevented:n.returnValue===!1)?ju:Vf,this.isPropagationStopped=Vf,this}return U(t.prototype,{preventDefault:function(){this.defaultPrevented=!0;var e=this.nativeEvent;e&&(e.preventDefault?e.preventDefault():typeof e.returnValue!="unknown"&&(e.returnValue=!1),this.isDefaultPrevented=ju)},stopPropagation:function(){var 
e=this.nativeEvent;e&&(e.stopPropagation?e.stopPropagation():typeof e.cancelBubble!="unknown"&&(e.cancelBubble=!0),this.isPropagationStopped=ju)},persist:function(){},isPersistent:ju}),t}var Oe={eventPhase:0,bubbles:0,cancelable:0,timeStamp:function(l){return l.timeStamp||Date.now()},defaultPrevented:0,isTrusted:0},Hu=Il(Oe),ja=U({},Oe,{view:0,detail:0}),tm=Il(ja),fc,sc,Ha,Bu=U({},ja,{screenX:0,screenY:0,clientX:0,clientY:0,pageX:0,pageY:0,ctrlKey:0,shiftKey:0,altKey:0,metaKey:0,getModifierState:oc,button:0,buttons:0,relatedTarget:function(l){return l.relatedTarget===void 0?l.fromElement===l.srcElement?l.toElement:l.fromElement:l.relatedTarget},movementX:function(l){return"movementX"in l?l.movementX:(l!==Ha&&(Ha&&l.type==="mousemove"?(fc=l.screenX-Ha.screenX,sc=l.screenY-Ha.screenY):sc=fc=0,Ha=l),fc)},movementY:function(l){return"movementY"in l?l.movementY:sc}}),Lf=Il(Bu),em=U({},Bu,{dataTransfer:0}),am=Il(em),um=U({},ja,{relatedTarget:0}),dc=Il(um),nm=U({},Oe,{animationName:0,elapsedTime:0,pseudoElement:0}),cm=Il(nm),im=U({},Oe,{clipboardData:function(l){return"clipboardData"in l?l.clipboardData:window.clipboardData}}),fm=Il(im),sm=U({},Oe,{data:0}),Kf=Il(sm),dm={Esc:"Escape",Spacebar:" ",Left:"ArrowLeft",Up:"ArrowUp",Right:"ArrowRight",Down:"ArrowDown",Del:"Delete",Win:"OS",Menu:"ContextMenu",Apps:"ContextMenu",Scroll:"ScrollLock",MozPrintableKey:"Unidentified"},om={8:"Backspace",9:"Tab",12:"Clear",13:"Enter",16:"Shift",17:"Control",18:"Alt",19:"Pause",20:"CapsLock",27:"Escape",32:" ",33:"PageUp",34:"PageDown",35:"End",36:"Home",37:"ArrowLeft",38:"ArrowUp",39:"ArrowRight",40:"ArrowDown",45:"Insert",46:"Delete",112:"F1",113:"F2",114:"F3",115:"F4",116:"F5",117:"F6",118:"F7",119:"F8",120:"F9",121:"F10",122:"F11",123:"F12",144:"NumLock",145:"ScrollLock",224:"Meta"},mm={Alt:"altKey",Control:"ctrlKey",Meta:"metaKey",Shift:"shiftKey"};function ym(l){var t=this.nativeEvent;return t.getModifierState?t.getModifierState(l):(l=mm[l])?!!t[l]:!1}function oc(){return ym}var 
rm=U({},ja,{key:function(l){if(l.key){var t=dm[l.key]||l.key;if(t!=="Unidentified")return t}return l.type==="keypress"?(l=Ru(l),l===13?"Enter":String.fromCharCode(l)):l.type==="keydown"||l.type==="keyup"?om[l.keyCode]||"Unidentified":""},code:0,location:0,ctrlKey:0,shiftKey:0,altKey:0,metaKey:0,repeat:0,locale:0,getModifierState:oc,charCode:function(l){return l.type==="keypress"?Ru(l):0},keyCode:function(l){return l.type==="keydown"||l.type==="keyup"?l.keyCode:0},which:function(l){return l.type==="keypress"?Ru(l):l.type==="keydown"||l.type==="keyup"?l.keyCode:0}}),hm=Il(rm),vm=U({},Bu,{pointerId:0,width:0,height:0,pressure:0,tangentialPressure:0,tiltX:0,tiltY:0,twist:0,pointerType:0,isPrimary:0}),Jf=Il(vm),gm=U({},ja,{touches:0,targetTouches:0,changedTouches:0,altKey:0,metaKey:0,ctrlKey:0,shiftKey:0,getModifierState:oc}),Sm=Il(gm),bm=U({},Oe,{propertyName:0,elapsedTime:0,pseudoElement:0}),zm=Il(bm),Em=U({},Bu,{deltaX:function(l){return"deltaX"in l?l.deltaX:"wheelDeltaX"in l?-l.wheelDeltaX:0},deltaY:function(l){return"deltaY"in l?l.deltaY:"wheelDeltaY"in l?-l.wheelDeltaY:"wheelDelta"in l?-l.wheelDelta:0},deltaZ:0,deltaMode:0}),Tm=Il(Em),Am=U({},Oe,{newState:0,oldState:0}),pm=Il(Am),_m=[9,13,27,32],mc=Bt&&"CompositionEvent"in window,Ba=null;Bt&&"documentMode"in document&&(Ba=document.documentMode);var Om=Bt&&"TextEvent"in window&&!Ba,wf=Bt&&(!mc||Ba&&8<Ba&&11>=Ba),Wf=" ",$f=!1;function kf(l,t){switch(l){case"keyup":return _m.indexOf(t.keyCode)!==-1;case"keydown":return t.keyCode!==229;case"keypress":case"mousedown":case"focusout":return!0;default:return!1}}function Ff(l){return l=l.detail,typeof l=="object"&&"data"in l?l.data:null}var Fe=!1;function Mm(l,t){switch(l){case"compositionend":return Ff(t);case"keypress":return t.which!==32?null:($f=!0,Wf);case"textInput":return l=t.data,l===Wf&&$f?null:l;default:return null}}function xm(l,t){if(Fe)return l==="compositionend"||!mc&&kf(l,t)?(l=Zf(),Cu=ic=le=null,Fe=!1,l):null;switch(l){case"paste":return 
null;case"keypress":if(!(t.ctrlKey||t.altKey||t.metaKey)||t.ctrlKey&&t.altKey){if(t.char&&1<t.char.length)return t.char;if(t.which)return String.fromCharCode(t.which)}return null;case"compositionend":return wf&&t.locale!=="ko"?null:t.data;default:return null}}var Nm={color:!0,date:!0,datetime:!0,"datetime-local":!0,email:!0,month:!0,number:!0,password:!0,range:!0,search:!0,tel:!0,text:!0,time:!0,url:!0,week:!0};function If(l){var t=l&&l.nodeName&&l.nodeName.toLowerCase();return t==="input"?!!Nm[l.type]:t==="textarea"}function Pf(l,t,e,a){$e?ke?ke.push(a):ke=[a]:$e=a,t=On(t,"onChange"),0<t.length&&(e=new Hu("onChange","change",null,e,a),l.push({event:e,listeners:t}))}var qa=null,Ya=null;function Dm(l){Hd(l,0)}function qu(l){var t=Ua(l);if(jf(t))return l}function ls(l,t){if(l==="change")return t}var ts=!1;if(Bt){var yc;if(Bt){var rc="oninput"in document;if(!rc){var es=document.createElement("div");es.setAttribute("oninput","return;"),rc=typeof es.oninput=="function"}yc=rc}else yc=!1;ts=yc&&(!document.documentMode||9<document.documentMode)}function as(){qa&&(qa.detachEvent("onpropertychange",us),Ya=qa=null)}function us(l){if(l.propertyName==="value"&&qu(Ya)){var t=[];Pf(t,Ya,l,uc(l)),Xf(Dm,t)}}function Um(l,t,e){l==="focusin"?(as(),qa=t,Ya=e,qa.attachEvent("onpropertychange",us)):l==="focusout"&&as()}function Cm(l){if(l==="selectionchange"||l==="keyup"||l==="keydown")return qu(Ya)}function Rm(l,t){if(l==="click")return qu(t)}function jm(l,t){if(l==="input"||l==="change")return qu(t)}function Hm(l,t){return l===t&&(l!==0||1/l===1/t)||l!==l&&t!==t}var ft=typeof Object.is=="function"?Object.is:Hm;function Ga(l,t){if(ft(l,t))return!0;if(typeof l!="object"||l===null||typeof t!="object"||t===null)return!1;var e=Object.keys(l),a=Object.keys(t);if(e.length!==a.length)return!1;for(a=0;a<e.length;a++){var u=e[a];if(!Kn.call(t,u)||!ft(l[u],t[u]))return!1}return!0}function ns(l){for(;l&&l.firstChild;)l=l.firstChild;return l}function cs(l,t){var e=ns(l);l=0;for(var 
a;e;){if(e.nodeType===3){if(a=l+e.textContent.length,l<=t&&a>=t)return{node:e,offset:t-l};l=a}l:{for(;e;){if(e.nextSibling){e=e.nextSibling;break l}e=e.parentNode}e=void 0}e=ns(e)}}function is(l,t){return l&&t?l===t?!0:l&&l.nodeType===3?!1:t&&t.nodeType===3?is(l,t.parentNode):"contains"in l?l.contains(t):l.compareDocumentPosition?!!(l.compareDocumentPosition(t)&16):!1:!1}function fs(l){l=l!=null&&l.ownerDocument!=null&&l.ownerDocument.defaultView!=null?l.ownerDocument.defaultView:window;for(var t=Du(l.document);t instanceof l.HTMLIFrameElement;){try{var e=typeof t.contentWindow.location.href=="string"}catch{e=!1}if(e)l=t.contentWindow;else break;t=Du(l.document)}return t}function hc(l){var t=l&&l.nodeName&&l.nodeName.toLowerCase();return t&&(t==="input"&&(l.type==="text"||l.type==="search"||l.type==="tel"||l.type==="url"||l.type==="password")||t==="textarea"||l.contentEditable==="true")}var Bm=Bt&&"documentMode"in document&&11>=document.documentMode,Ie=null,vc=null,Qa=null,gc=!1;function ss(l,t,e){var a=e.window===e?e.document:e.nodeType===9?e:e.ownerDocument;gc||Ie==null||Ie!==Du(a)||(a=Ie,"selectionStart"in a&&hc(a)?a={start:a.selectionStart,end:a.selectionEnd}:(a=(a.ownerDocument&&a.ownerDocument.defaultView||window).getSelection(),a={anchorNode:a.anchorNode,anchorOffset:a.anchorOffset,focusNode:a.focusNode,focusOffset:a.focusOffset}),Qa&&Ga(Qa,a)||(Qa=a,a=On(vc,"onSelect"),0<a.length&&(t=new Hu("onSelect","select",null,t,e),l.push({event:t,listeners:a}),t.target=Ie)))}function Me(l,t){var e={};return e[l.toLowerCase()]=t.toLowerCase(),e["Webkit"+l]="webkit"+t,e["Moz"+l]="moz"+t,e}var 
Pe={animationend:Me("Animation","AnimationEnd"),animationiteration:Me("Animation","AnimationIteration"),animationstart:Me("Animation","AnimationStart"),transitionrun:Me("Transition","TransitionRun"),transitionstart:Me("Transition","TransitionStart"),transitioncancel:Me("Transition","TransitionCancel"),transitionend:Me("Transition","TransitionEnd")},Sc={},ds={};Bt&&(ds=document.createElement("div").style,"AnimationEvent"in window||(delete Pe.animationend.animation,delete Pe.animationiteration.animation,delete Pe.animationstart.animation),"TransitionEvent"in window||delete Pe.transitionend.transition);function xe(l){if(Sc[l])return Sc[l];if(!Pe[l])return l;var t=Pe[l],e;for(e in t)if(t.hasOwnProperty(e)&&e in ds)return Sc[l]=t[e];return l}var os=xe("animationend"),ms=xe("animationiteration"),ys=xe("animationstart"),qm=xe("transitionrun"),Ym=xe("transitionstart"),Gm=xe("transitioncancel"),rs=xe("transitionend"),hs=new Map,bc="abort auxClick beforeToggle cancel canPlay canPlayThrough click close contextMenu copy cut drag dragEnd dragEnter dragExit dragLeave dragOver dragStart drop durationChange emptied encrypted ended error gotPointerCapture input invalid keyDown keyPress keyUp load loadedData loadedMetadata loadStart lostPointerCapture mouseDown mouseMove mouseOut mouseOver mouseUp paste pause play playing pointerCancel pointerDown pointerMove pointerOut pointerOver pointerUp progress rateChange reset resize seeked seeking stalled submit suspend timeUpdate touchCancel touchEnd touchStart volumeChange scroll toggle touchMove waiting wheel".split(" ");bc.push("scrollEnd");function Ot(l,t){hs.set(l,t),_e(t,[l])}var Yu=typeof reportError=="function"?reportError:function(l){if(typeof window=="object"&&typeof window.ErrorEvent=="function"){var t=new window.ErrorEvent("error",{bubbles:!0,cancelable:!0,message:typeof l=="object"&&l!==null&&typeof l.message=="string"?String(l.message):String(l),error:l});if(!window.dispatchEvent(t))return}else if(typeof 
process=="object"&&typeof process.emit=="function"){process.emit("uncaughtException",l);return}console.error(l)},gt=[],la=0,zc=0;function Gu(){for(var l=la,t=zc=la=0;t<l;){var e=gt[t];gt[t++]=null;var a=gt[t];gt[t++]=null;var u=gt[t];gt[t++]=null;var n=gt[t];if(gt[t++]=null,a!==null&&u!==null){var c=a.pending;c===null?u.next=u:(u.next=c.next,c.next=u),a.pending=u}n!==0&&vs(e,u,n)}}function Qu(l,t,e,a){gt[la++]=l,gt[la++]=t,gt[la++]=e,gt[la++]=a,zc|=a,l.lanes|=a,l=l.alternate,l!==null&&(l.lanes|=a)}function Ec(l,t,e,a){return Qu(l,t,e,a),Xu(l)}function Ne(l,t){return Qu(l,null,null,t),Xu(l)}function vs(l,t,e){l.lanes|=e;var a=l.alternate;a!==null&&(a.lanes|=e);for(var u=!1,n=l.return;n!==null;)n.childLanes|=e,a=n.alternate,a!==null&&(a.childLanes|=e),n.tag===22&&(l=n.stateNode,l===null||l._visibility&1||(u=!0)),l=n,n=n.return;return l.tag===3?(n=l.stateNode,u&&t!==null&&(u=31-it(e),l=n.hiddenUpdates,a=l[u],a===null?l[u]=[t]:a.push(t),t.lane=e|536870912),n):null}function Xu(l){if(50<fu)throw fu=0,Di=null,Error(o(185));for(var t=l.return;t!==null;)l=t,t=l.return;return l.tag===3?l.stateNode:null}var ta={};function Qm(l,t,e,a){this.tag=l,this.key=e,this.sibling=this.child=this.return=this.stateNode=this.type=this.elementType=null,this.index=0,this.refCleanup=this.ref=null,this.pendingProps=t,this.dependencies=this.memoizedState=this.updateQueue=this.memoizedProps=null,this.mode=a,this.subtreeFlags=this.flags=0,this.deletions=null,this.childLanes=this.lanes=0,this.alternate=null}function st(l,t,e,a){return new Qm(l,t,e,a)}function Tc(l){return l=l.prototype,!(!l||!l.isReactComponent)}function qt(l,t){var e=l.alternate;return 
e===null?(e=st(l.tag,t,l.key,l.mode),e.elementType=l.elementType,e.type=l.type,e.stateNode=l.stateNode,e.alternate=l,l.alternate=e):(e.pendingProps=t,e.type=l.type,e.flags=0,e.subtreeFlags=0,e.deletions=null),e.flags=l.flags&65011712,e.childLanes=l.childLanes,e.lanes=l.lanes,e.child=l.child,e.memoizedProps=l.memoizedProps,e.memoizedState=l.memoizedState,e.updateQueue=l.updateQueue,t=l.dependencies,e.dependencies=t===null?null:{lanes:t.lanes,firstContext:t.firstContext},e.sibling=l.sibling,e.index=l.index,e.ref=l.ref,e.refCleanup=l.refCleanup,e}function gs(l,t){l.flags&=65011714;var e=l.alternate;return e===null?(l.childLanes=0,l.lanes=t,l.child=null,l.subtreeFlags=0,l.memoizedProps=null,l.memoizedState=null,l.updateQueue=null,l.dependencies=null,l.stateNode=null):(l.childLanes=e.childLanes,l.lanes=e.lanes,l.child=e.child,l.subtreeFlags=0,l.deletions=null,l.memoizedProps=e.memoizedProps,l.memoizedState=e.memoizedState,l.updateQueue=e.updateQueue,l.type=e.type,t=e.dependencies,l.dependencies=t===null?null:{lanes:t.lanes,firstContext:t.firstContext}),l}function Zu(l,t,e,a,u,n){var c=0;if(a=l,typeof l=="function")Tc(l)&&(c=1);else if(typeof l=="string")c=Ky(l,e,D.current)?26:l==="html"||l==="head"||l==="body"?27:5;else l:switch(l){case Kl:return l=st(31,e,t,u),l.elementType=Kl,l.lanes=n,l;case vl:return De(e.children,u,n,t);case al:c=8,u|=24;break;case pl:return l=st(12,e,t,u|2),l.elementType=pl,l.lanes=n,l;case Gl:return l=st(13,e,t,u),l.elementType=Gl,l.lanes=n,l;case El:return l=st(19,e,t,u),l.elementType=El,l.lanes=n,l;default:if(typeof l=="object"&&l!==null)switch(l.$$typeof){case ul:c=10;break l;case $:c=9;break l;case nl:c=11;break l;case W:c=14;break l;case Hl:c=16,a=null;break l}c=29,e=Error(o(130,l===null?"null":typeof l,"")),a=null}return t=st(c,e,t,u),t.elementType=l,t.type=a,t.lanes=n,t}function De(l,t,e,a){return l=st(7,l,a,t),l.lanes=e,l}function Ac(l,t,e){return l=st(6,l,null,t),l.lanes=e,l}function Ss(l){var t=st(18,null,null,0);return 
t.stateNode=l,t}function pc(l,t,e){return t=st(4,l.children!==null?l.children:[],l.key,t),t.lanes=e,t.stateNode={containerInfo:l.containerInfo,pendingChildren:null,implementation:l.implementation},t}var bs=new WeakMap;function St(l,t){if(typeof l=="object"&&l!==null){var e=bs.get(l);return e!==void 0?e:(t={value:l,source:t,stack:Sf(t)},bs.set(l,t),t)}return{value:l,source:t,stack:Sf(t)}}var ea=[],aa=0,Vu=null,Xa=0,bt=[],zt=0,te=null,Nt=1,Dt="";function Yt(l,t){ea[aa++]=Xa,ea[aa++]=Vu,Vu=l,Xa=t}function zs(l,t,e){bt[zt++]=Nt,bt[zt++]=Dt,bt[zt++]=te,te=l;var a=Nt;l=Dt;var u=32-it(a)-1;a&=~(1<<u),e+=1;var n=32-it(t)+u;if(30<n){var c=u-u%5;n=(a&(1<<c)-1).toString(32),a>>=c,u-=c,Nt=1<<32-it(t)+u|e<<u|a,Dt=n+l}else Nt=1<<n|e<<u|a,Dt=l}function _c(l){l.return!==null&&(Yt(l,1),zs(l,1,0))}function Oc(l){for(;l===Vu;)Vu=ea[--aa],ea[aa]=null,Xa=ea[--aa],ea[aa]=null;for(;l===te;)te=bt[--zt],bt[zt]=null,Dt=bt[--zt],bt[zt]=null,Nt=bt[--zt],bt[zt]=null}function Es(l,t){bt[zt++]=Nt,bt[zt++]=Dt,bt[zt++]=te,Nt=t.id,Dt=t.overflow,te=l}var Xl=null,Tl=null,el=!1,ee=null,Et=!1,Mc=Error(o(519));function ae(l){var t=Error(o(418,1<arguments.length&&arguments[1]!==void 0&&arguments[1]?"text":"HTML",""));throw Za(St(t,l)),Mc}function Ts(l){var t=l.stateNode,e=l.type,a=l.memoizedProps;switch(t[Ql]=l,t[Fl]=a,e){case"dialog":F("cancel",t),F("close",t);break;case"iframe":case"object":case"embed":F("load",t);break;case"video":case"audio":for(e=0;e<du.length;e++)F(du[e],t);break;case"source":F("error",t);break;case"img":case"image":case"link":F("error",t),F("load",t);break;case"details":F("toggle",t);break;case"input":F("invalid",t),Hf(t,a.value,a.defaultValue,a.checked,a.defaultChecked,a.type,a.name,!0);break;case"select":F("invalid",t);break;case"textarea":F("invalid",t),qf(t,a.value,a.defaultValue,a.children)}e=a.children,typeof e!="string"&&typeof e!="number"&&typeof 
e!="bigint"||t.textContent===""+e||a.suppressHydrationWarning===!0||Gd(t.textContent,e)?(a.popover!=null&&(F("beforetoggle",t),F("toggle",t)),a.onScroll!=null&&F("scroll",t),a.onScrollEnd!=null&&F("scrollend",t),a.onClick!=null&&(t.onclick=Ht),t=!0):t=!1,t||ae(l,!0)}function As(l){for(Xl=l.return;Xl;)switch(Xl.tag){case 5:case 31:case 13:Et=!1;return;case 27:case 3:Et=!0;return;default:Xl=Xl.return}}function ua(l){if(l!==Xl)return!1;if(!el)return As(l),el=!0,!1;var t=l.tag,e;if((e=t!==3&&t!==27)&&((e=t===5)&&(e=l.type,e=!(e!=="form"&&e!=="button")||Ki(l.type,l.memoizedProps)),e=!e),e&&Tl&&ae(l),As(l),t===13){if(l=l.memoizedState,l=l!==null?l.dehydrated:null,!l)throw Error(o(317));Tl=Wd(l)}else if(t===31){if(l=l.memoizedState,l=l!==null?l.dehydrated:null,!l)throw Error(o(317));Tl=Wd(l)}else t===27?(t=Tl,ge(l.type)?(l=ki,ki=null,Tl=l):Tl=t):Tl=Xl?At(l.stateNode.nextSibling):null;return!0}function Ue(){Tl=Xl=null,el=!1}function xc(){var l=ee;return l!==null&&(et===null?et=l:et.push.apply(et,l),ee=null),l}function Za(l){ee===null?ee=[l]:ee.push(l)}var Nc=d(null),Ce=null,Gt=null;function ue(l,t,e){O(Nc,t._currentValue),t._currentValue=e}function Qt(l){l._currentValue=Nc.current,v(Nc)}function Dc(l,t,e){for(;l!==null;){var a=l.alternate;if((l.childLanes&t)!==t?(l.childLanes|=t,a!==null&&(a.childLanes|=t)):a!==null&&(a.childLanes&t)!==t&&(a.childLanes|=t),l===e)break;l=l.return}}function Uc(l,t,e,a){var u=l.child;for(u!==null&&(u.return=l);u!==null;){var n=u.dependencies;if(n!==null){var c=u.child;n=n.firstContext;l:for(;n!==null;){var i=n;n=u;for(var f=0;f<t.length;f++)if(i.context===t[f]){n.lanes|=e,i=n.alternate,i!==null&&(i.lanes|=e),Dc(n.return,e,l),a||(c=null);break l}n=i.next}}else if(u.tag===18){if(c=u.return,c===null)throw Error(o(341));c.lanes|=e,n=c.alternate,n!==null&&(n.lanes|=e),Dc(c,e,l),c=null}else c=u.child;if(c!==null)c.return=u;else for(c=u;c!==null;){if(c===l){c=null;break}if(u=c.sibling,u!==null){u.return=c.return,c=u;break}c=c.return}u=c}}function 
na(l,t,e,a){l=null;for(var u=t,n=!1;u!==null;){if(!n){if((u.flags&524288)!==0)n=!0;else if((u.flags&262144)!==0)break}if(u.tag===10){var c=u.alternate;if(c===null)throw Error(o(387));if(c=c.memoizedProps,c!==null){var i=u.type;ft(u.pendingProps.value,c.value)||(l!==null?l.push(i):l=[i])}}else if(u===ol.current){if(c=u.alternate,c===null)throw Error(o(387));c.memoizedState.memoizedState!==u.memoizedState.memoizedState&&(l!==null?l.push(hu):l=[hu])}u=u.return}l!==null&&Uc(t,l,e,a),t.flags|=262144}function Lu(l){for(l=l.firstContext;l!==null;){if(!ft(l.context._currentValue,l.memoizedValue))return!0;l=l.next}return!1}function Re(l){Ce=l,Gt=null,l=l.dependencies,l!==null&&(l.firstContext=null)}function Zl(l){return ps(Ce,l)}function Ku(l,t){return Ce===null&&Re(l),ps(l,t)}function ps(l,t){var e=t._currentValue;if(t={context:t,memoizedValue:e,next:null},Gt===null){if(l===null)throw Error(o(308));Gt=t,l.dependencies={lanes:0,firstContext:t},l.flags|=524288}else Gt=Gt.next=t;return e}var Xm=typeof AbortController<"u"?AbortController:function(){var l=[],t=this.signal={aborted:!1,addEventListener:function(e,a){l.push(a)}};this.abort=function(){t.aborted=!0,l.forEach(function(e){return e()})}},Zm=g.unstable_scheduleCallback,Vm=g.unstable_NormalPriority,Ul={$$typeof:ul,Consumer:null,Provider:null,_currentValue:null,_currentValue2:null,_threadCount:0};function Cc(){return{controller:new Xm,data:new Map,refCount:0}}function Va(l){l.refCount--,l.refCount===0&&Zm(Vm,function(){l.controller.abort()})}var La=null,Rc=0,ca=0,ia=null;function Lm(l,t){if(La===null){var e=La=[];Rc=0,ca=Bi(),ia={status:"pending",value:void 0,then:function(a){e.push(a)}}}return Rc++,t.then(_s,_s),t}function _s(){if(--Rc===0&&La!==null){ia!==null&&(ia.status="fulfilled");var l=La;La=null,ca=0,ia=null;for(var t=0;t<l.length;t++)(0,l[t])()}}function Km(l,t){var e=[],a={status:"pending",value:null,reason:null,then:function(u){e.push(u)}};return l.then(function(){a.status="fulfilled",a.value=t;for(var 
u=0;u<e.length;u++)(0,e[u])(t)},function(u){for(a.status="rejected",a.reason=u,u=0;u<e.length;u++)(0,e[u])(void 0)}),a}var Os=z.S;z.S=function(l,t){sd=nt(),typeof t=="object"&&t!==null&&typeof t.then=="function"&&Lm(l,t),Os!==null&&Os(l,t)};var je=d(null);function jc(){var l=je.current;return l!==null?l:zl.pooledCache}function Ju(l,t){t===null?O(je,je.current):O(je,t.pool)}function Ms(){var l=jc();return l===null?null:{parent:Ul._currentValue,pool:l}}var fa=Error(o(460)),Hc=Error(o(474)),wu=Error(o(542)),Wu={then:function(){}};function xs(l){return l=l.status,l==="fulfilled"||l==="rejected"}function Ns(l,t,e){switch(e=l[e],e===void 0?l.push(t):e!==t&&(t.then(Ht,Ht),t=e),t.status){case"fulfilled":return t.value;case"rejected":throw l=t.reason,Us(l),l;default:if(typeof t.status=="string")t.then(Ht,Ht);else{if(l=zl,l!==null&&100<l.shellSuspendCounter)throw Error(o(482));l=t,l.status="pending",l.then(function(a){if(t.status==="pending"){var u=t;u.status="fulfilled",u.value=a}},function(a){if(t.status==="pending"){var u=t;u.status="rejected",u.reason=a}})}switch(t.status){case"fulfilled":return t.value;case"rejected":throw l=t.reason,Us(l),l}throw Be=t,fa}}function He(l){try{var t=l._init;return t(l._payload)}catch(e){throw e!==null&&typeof e=="object"&&typeof e.then=="function"?(Be=e,fa):e}}var Be=null;function Ds(){if(Be===null)throw Error(o(459));var l=Be;return Be=null,l}function Us(l){if(l===fa||l===wu)throw Error(o(483))}var sa=null,Ka=0;function $u(l){var t=Ka;return Ka+=1,sa===null&&(sa=[]),Ns(sa,l,t)}function Ja(l,t){t=t.props.ref,l.ref=t!==void 0?t:null}function ku(l,t){throw t.$$typeof===x?Error(o(525)):(l=Object.prototype.toString.call(t),Error(o(31,l==="[object Object]"?"object with keys {"+Object.keys(t).join(", ")+"}":l)))}function Cs(l){function t(m,s){if(l){var y=m.deletions;y===null?(m.deletions=[s],m.flags|=16):y.push(s)}}function e(m,s){if(!l)return null;for(;s!==null;)t(m,s),s=s.sibling;return null}function a(m){for(var s=new 
Map;m!==null;)m.key!==null?s.set(m.key,m):s.set(m.index,m),m=m.sibling;return s}function u(m,s){return m=qt(m,s),m.index=0,m.sibling=null,m}function n(m,s,y){return m.index=y,l?(y=m.alternate,y!==null?(y=y.index,y<s?(m.flags|=67108866,s):y):(m.flags|=67108866,s)):(m.flags|=1048576,s)}function c(m){return l&&m.alternate===null&&(m.flags|=67108866),m}function i(m,s,y,T){return s===null||s.tag!==6?(s=Ac(y,m.mode,T),s.return=m,s):(s=u(s,y),s.return=m,s)}function f(m,s,y,T){var Q=y.type;return Q===vl?b(m,s,y.props.children,T,y.key):s!==null&&(s.elementType===Q||typeof Q=="object"&&Q!==null&&Q.$$typeof===Hl&&He(Q)===s.type)?(s=u(s,y.props),Ja(s,y),s.return=m,s):(s=Zu(y.type,y.key,y.props,null,m.mode,T),Ja(s,y),s.return=m,s)}function r(m,s,y,T){return s===null||s.tag!==4||s.stateNode.containerInfo!==y.containerInfo||s.stateNode.implementation!==y.implementation?(s=pc(y,m.mode,T),s.return=m,s):(s=u(s,y.children||[]),s.return=m,s)}function b(m,s,y,T,Q){return s===null||s.tag!==7?(s=De(y,m.mode,T,Q),s.return=m,s):(s=u(s,y),s.return=m,s)}function A(m,s,y){if(typeof s=="string"&&s!==""||typeof s=="number"||typeof s=="bigint")return s=Ac(""+s,m.mode,y),s.return=m,s;if(typeof s=="object"&&s!==null){switch(s.$$typeof){case tl:return y=Zu(s.type,s.key,s.props,null,m.mode,y),Ja(y,s),y.return=m,y;case I:return s=pc(s,m.mode,y),s.return=m,s;case Hl:return s=He(s),A(m,s,y)}if(ut(s)||fl(s))return s=De(s,m.mode,y,null),s.return=m,s;if(typeof s.then=="function")return A(m,$u(s),y);if(s.$$typeof===ul)return A(m,Ku(m,s),y);ku(m,s)}return null}function h(m,s,y,T){var Q=s!==null?s.key:null;if(typeof y=="string"&&y!==""||typeof y=="number"||typeof y=="bigint")return Q!==null?null:i(m,s,""+y,T);if(typeof y=="object"&&y!==null){switch(y.$$typeof){case tl:return y.key===Q?f(m,s,y,T):null;case I:return y.key===Q?r(m,s,y,T):null;case Hl:return y=He(y),h(m,s,y,T)}if(ut(y)||fl(y))return Q!==null?null:b(m,s,y,T,null);if(typeof y.then=="function")return h(m,s,$u(y),T);if(y.$$typeof===ul)return 
h(m,s,Ku(m,y),T);ku(m,y)}return null}function S(m,s,y,T,Q){if(typeof T=="string"&&T!==""||typeof T=="number"||typeof T=="bigint")return m=m.get(y)||null,i(s,m,""+T,Q);if(typeof T=="object"&&T!==null){switch(T.$$typeof){case tl:return m=m.get(T.key===null?y:T.key)||null,f(s,m,T,Q);case I:return m=m.get(T.key===null?y:T.key)||null,r(s,m,T,Q);case Hl:return T=He(T),S(m,s,y,T,Q)}if(ut(T)||fl(T))return m=m.get(y)||null,b(s,m,T,Q,null);if(typeof T.then=="function")return S(m,s,y,$u(T),Q);if(T.$$typeof===ul)return S(m,s,y,Ku(s,T),Q);ku(s,T)}return null}function j(m,s,y,T){for(var Q=null,cl=null,B=s,w=s=0,ll=null;B!==null&&w<y.length;w++){B.index>w?(ll=B,B=null):ll=B.sibling;var il=h(m,B,y[w],T);if(il===null){B===null&&(B=ll);break}l&&B&&il.alternate===null&&t(m,B),s=n(il,s,w),cl===null?Q=il:cl.sibling=il,cl=il,B=ll}if(w===y.length)return e(m,B),el&&Yt(m,w),Q;if(B===null){for(;w<y.length;w++)B=A(m,y[w],T),B!==null&&(s=n(B,s,w),cl===null?Q=B:cl.sibling=B,cl=B);return el&&Yt(m,w),Q}for(B=a(B);w<y.length;w++)ll=S(B,m,w,y[w],T),ll!==null&&(l&&ll.alternate!==null&&B.delete(ll.key===null?w:ll.key),s=n(ll,s,w),cl===null?Q=ll:cl.sibling=ll,cl=ll);return l&&B.forEach(function(Te){return t(m,Te)}),el&&Yt(m,w),Q}function Z(m,s,y,T){if(y==null)throw Error(o(151));for(var Q=null,cl=null,B=s,w=s=0,ll=null,il=y.next();B!==null&&!il.done;w++,il=y.next()){B.index>w?(ll=B,B=null):ll=B.sibling;var Te=h(m,B,il.value,T);if(Te===null){B===null&&(B=ll);break}l&&B&&Te.alternate===null&&t(m,B),s=n(Te,s,w),cl===null?Q=Te:cl.sibling=Te,cl=Te,B=ll}if(il.done)return e(m,B),el&&Yt(m,w),Q;if(B===null){for(;!il.done;w++,il=y.next())il=A(m,il.value,T),il!==null&&(s=n(il,s,w),cl===null?Q=il:cl.sibling=il,cl=il);return el&&Yt(m,w),Q}for(B=a(B);!il.done;w++,il=y.next())il=S(B,m,w,il.value,T),il!==null&&(l&&il.alternate!==null&&B.delete(il.key===null?w:il.key),s=n(il,s,w),cl===null?Q=il:cl.sibling=il,cl=il);return l&&B.forEach(function(er){return t(m,er)}),el&&Yt(m,w),Q}function bl(m,s,y,T){if(typeof 
y=="object"&&y!==null&&y.type===vl&&y.key===null&&(y=y.props.children),typeof y=="object"&&y!==null){switch(y.$$typeof){case tl:l:{for(var Q=y.key;s!==null;){if(s.key===Q){if(Q=y.type,Q===vl){if(s.tag===7){e(m,s.sibling),T=u(s,y.props.children),T.return=m,m=T;break l}}else if(s.elementType===Q||typeof Q=="object"&&Q!==null&&Q.$$typeof===Hl&&He(Q)===s.type){e(m,s.sibling),T=u(s,y.props),Ja(T,y),T.return=m,m=T;break l}e(m,s);break}else t(m,s);s=s.sibling}y.type===vl?(T=De(y.props.children,m.mode,T,y.key),T.return=m,m=T):(T=Zu(y.type,y.key,y.props,null,m.mode,T),Ja(T,y),T.return=m,m=T)}return c(m);case I:l:{for(Q=y.key;s!==null;){if(s.key===Q)if(s.tag===4&&s.stateNode.containerInfo===y.containerInfo&&s.stateNode.implementation===y.implementation){e(m,s.sibling),T=u(s,y.children||[]),T.return=m,m=T;break l}else{e(m,s);break}else t(m,s);s=s.sibling}T=pc(y,m.mode,T),T.return=m,m=T}return c(m);case Hl:return y=He(y),bl(m,s,y,T)}if(ut(y))return j(m,s,y,T);if(fl(y)){if(Q=fl(y),typeof Q!="function")throw Error(o(150));return y=Q.call(y),Z(m,s,y,T)}if(typeof y.then=="function")return bl(m,s,$u(y),T);if(y.$$typeof===ul)return bl(m,s,Ku(m,y),T);ku(m,y)}return typeof y=="string"&&y!==""||typeof y=="number"||typeof y=="bigint"?(y=""+y,s!==null&&s.tag===6?(e(m,s.sibling),T=u(s,y),T.return=m,m=T):(e(m,s),T=Ac(y,m.mode,T),T.return=m,m=T),c(m)):e(m,s)}return function(m,s,y,T){try{Ka=0;var Q=bl(m,s,y,T);return sa=null,Q}catch(B){if(B===fa||B===wu)throw B;var cl=st(29,B,null,m.mode);return cl.lanes=T,cl.return=m,cl}}}var qe=Cs(!0),Rs=Cs(!1),ne=!1;function Bc(l){l.updateQueue={baseState:l.memoizedState,firstBaseUpdate:null,lastBaseUpdate:null,shared:{pending:null,lanes:0,hiddenCallbacks:null},callbacks:null}}function qc(l,t){l=l.updateQueue,t.updateQueue===l&&(t.updateQueue={baseState:l.baseState,firstBaseUpdate:l.firstBaseUpdate,lastBaseUpdate:l.lastBaseUpdate,shared:l.shared,callbacks:null})}function ce(l){return{lane:l,tag:0,payload:null,callback:null,next:null}}function 
ie(l,t,e){var a=l.updateQueue;if(a===null)return null;if(a=a.shared,(sl&2)!==0){var u=a.pending;return u===null?t.next=t:(t.next=u.next,u.next=t),a.pending=t,t=Xu(l),vs(l,null,e),t}return Qu(l,a,t,e),Xu(l)}function wa(l,t,e){if(t=t.updateQueue,t!==null&&(t=t.shared,(e&4194048)!==0)){var a=t.lanes;a&=l.pendingLanes,e|=a,t.lanes=e,pf(l,e)}}function Yc(l,t){var e=l.updateQueue,a=l.alternate;if(a!==null&&(a=a.updateQueue,e===a)){var u=null,n=null;if(e=e.firstBaseUpdate,e!==null){do{var c={lane:e.lane,tag:e.tag,payload:e.payload,callback:null,next:null};n===null?u=n=c:n=n.next=c,e=e.next}while(e!==null);n===null?u=n=t:n=n.next=t}else u=n=t;e={baseState:a.baseState,firstBaseUpdate:u,lastBaseUpdate:n,shared:a.shared,callbacks:a.callbacks},l.updateQueue=e;return}l=e.lastBaseUpdate,l===null?e.firstBaseUpdate=t:l.next=t,e.lastBaseUpdate=t}var Gc=!1;function Wa(){if(Gc){var l=ia;if(l!==null)throw l}}function $a(l,t,e,a){Gc=!1;var u=l.updateQueue;ne=!1;var n=u.firstBaseUpdate,c=u.lastBaseUpdate,i=u.shared.pending;if(i!==null){u.shared.pending=null;var f=i,r=f.next;f.next=null,c===null?n=r:c.next=r,c=f;var b=l.alternate;b!==null&&(b=b.updateQueue,i=b.lastBaseUpdate,i!==c&&(i===null?b.firstBaseUpdate=r:i.next=r,b.lastBaseUpdate=f))}if(n!==null){var A=u.baseState;c=0,b=r=f=null,i=n;do{var h=i.lane&-536870913,S=h!==i.lane;if(S?(P&h)===h:(a&h)===h){h!==0&&h===ca&&(Gc=!0),b!==null&&(b=b.next={lane:0,tag:i.tag,payload:i.payload,callback:null,next:null});l:{var j=l,Z=i;h=t;var bl=e;switch(Z.tag){case 1:if(j=Z.payload,typeof j=="function"){A=j.call(bl,A,h);break l}A=j;break l;case 3:j.flags=j.flags&-65537|128;case 0:if(j=Z.payload,h=typeof j=="function"?j.call(bl,A,h):j,h==null)break l;A=U({},A,h);break l;case 2:ne=!0}}h=i.callback,h!==null&&(l.flags|=64,S&&(l.flags|=8192),S=u.callbacks,S===null?u.callbacks=[h]:S.push(h))}else 
S={lane:h,tag:i.tag,payload:i.payload,callback:i.callback,next:null},b===null?(r=b=S,f=A):b=b.next=S,c|=h;if(i=i.next,i===null){if(i=u.shared.pending,i===null)break;S=i,i=S.next,S.next=null,u.lastBaseUpdate=S,u.shared.pending=null}}while(!0);b===null&&(f=A),u.baseState=f,u.firstBaseUpdate=r,u.lastBaseUpdate=b,n===null&&(u.shared.lanes=0),me|=c,l.lanes=c,l.memoizedState=A}}function js(l,t){if(typeof l!="function")throw Error(o(191,l));l.call(t)}function Hs(l,t){var e=l.callbacks;if(e!==null)for(l.callbacks=null,l=0;l<e.length;l++)js(e[l],t)}var da=d(null),Fu=d(0);function Bs(l,t){l=$t,O(Fu,l),O(da,t),$t=l|t.baseLanes}function Qc(){O(Fu,$t),O(da,da.current)}function Xc(){$t=Fu.current,v(da),v(Fu)}var dt=d(null),Tt=null;function fe(l){var t=l.alternate;O(Nl,Nl.current&1),O(dt,l),Tt===null&&(t===null||da.current!==null||t.memoizedState!==null)&&(Tt=l)}function Zc(l){O(Nl,Nl.current),O(dt,l),Tt===null&&(Tt=l)}function qs(l){l.tag===22?(O(Nl,Nl.current),O(dt,l),Tt===null&&(Tt=l)):se()}function se(){O(Nl,Nl.current),O(dt,dt.current)}function ot(l){v(dt),Tt===l&&(Tt=null),v(Nl)}var Nl=d(0);function Iu(l){for(var t=l;t!==null;){if(t.tag===13){var e=t.memoizedState;if(e!==null&&(e=e.dehydrated,e===null||Wi(e)||$i(e)))return t}else if(t.tag===19&&(t.memoizedProps.revealOrder==="forwards"||t.memoizedProps.revealOrder==="backwards"||t.memoizedProps.revealOrder==="unstable_legacy-backwards"||t.memoizedProps.revealOrder==="together")){if((t.flags&128)!==0)return t}else if(t.child!==null){t.child.return=t,t=t.child;continue}if(t===l)break;for(;t.sibling===null;){if(t.return===null||t.return===l)return null;t=t.return}t.sibling.return=t.return,t=t.sibling}return null}var Xt=0,J=null,gl=null,Cl=null,Pu=!1,oa=!1,Ye=!1,ln=0,ka=0,ma=null,Jm=0;function Ol(){throw Error(o(321))}function Vc(l,t){if(t===null)return!1;for(var e=0;e<t.length&&e<l.length;e++)if(!ft(l[e],t[e]))return!1;return!0}function Lc(l,t,e,a,u,n){return 
Xt=n,J=t,t.memoizedState=null,t.updateQueue=null,t.lanes=0,z.H=l===null||l.memoizedState===null?z0:ni,Ye=!1,n=e(a,u),Ye=!1,oa&&(n=Gs(t,e,a,u)),Ys(l),n}function Ys(l){z.H=Pa;var t=gl!==null&&gl.next!==null;if(Xt=0,Cl=gl=J=null,Pu=!1,ka=0,ma=null,t)throw Error(o(300));l===null||Rl||(l=l.dependencies,l!==null&&Lu(l)&&(Rl=!0))}function Gs(l,t,e,a){J=l;var u=0;do{if(oa&&(ma=null),ka=0,oa=!1,25<=u)throw Error(o(301));if(u+=1,Cl=gl=null,l.updateQueue!=null){var n=l.updateQueue;n.lastEffect=null,n.events=null,n.stores=null,n.memoCache!=null&&(n.memoCache.index=0)}z.H=E0,n=t(e,a)}while(oa);return n}function wm(){var l=z.H,t=l.useState()[0];return t=typeof t.then=="function"?Fa(t):t,l=l.useState()[0],(gl!==null?gl.memoizedState:null)!==l&&(J.flags|=1024),t}function Kc(){var l=ln!==0;return ln=0,l}function Jc(l,t,e){t.updateQueue=l.updateQueue,t.flags&=-2053,l.lanes&=~e}function wc(l){if(Pu){for(l=l.memoizedState;l!==null;){var t=l.queue;t!==null&&(t.pending=null),l=l.next}Pu=!1}Xt=0,Cl=gl=J=null,oa=!1,ka=ln=0,ma=null}function Wl(){var l={memoizedState:null,baseState:null,baseQueue:null,queue:null,next:null};return Cl===null?J.memoizedState=Cl=l:Cl=Cl.next=l,Cl}function Dl(){if(gl===null){var l=J.alternate;l=l!==null?l.memoizedState:null}else l=gl.next;var t=Cl===null?J.memoizedState:Cl.next;if(t!==null)Cl=t,gl=l;else{if(l===null)throw J.alternate===null?Error(o(467)):Error(o(310));gl=l,l={memoizedState:gl.memoizedState,baseState:gl.baseState,baseQueue:gl.baseQueue,queue:gl.queue,next:null},Cl===null?J.memoizedState=Cl=l:Cl=Cl.next=l}return Cl}function tn(){return{lastEffect:null,events:null,stores:null,memoCache:null}}function Fa(l){var t=ka;return ka+=1,ma===null&&(ma=[]),l=Ns(ma,l,t),t=J,(Cl===null?t.memoizedState:Cl.next)===null&&(t=t.alternate,z.H=t===null||t.memoizedState===null?z0:ni),l}function en(l){if(l!==null&&typeof l=="object"){if(typeof l.then=="function")return Fa(l);if(l.$$typeof===ul)return Zl(l)}throw Error(o(438,String(l)))}function Wc(l){var 
t=null,e=J.updateQueue;if(e!==null&&(t=e.memoCache),t==null){var a=J.alternate;a!==null&&(a=a.updateQueue,a!==null&&(a=a.memoCache,a!=null&&(t={data:a.data.map(function(u){return u.slice()}),index:0})))}if(t==null&&(t={data:[],index:0}),e===null&&(e=tn(),J.updateQueue=e),e.memoCache=t,e=t.data[t.index],e===void 0)for(e=t.data[t.index]=Array(l),a=0;a<l;a++)e[a]=$l;return t.index++,e}function Zt(l,t){return typeof t=="function"?t(l):t}function an(l){var t=Dl();return $c(t,gl,l)}function $c(l,t,e){var a=l.queue;if(a===null)throw Error(o(311));a.lastRenderedReducer=e;var u=l.baseQueue,n=a.pending;if(n!==null){if(u!==null){var c=u.next;u.next=n.next,n.next=c}t.baseQueue=u=n,a.pending=null}if(n=l.baseState,u===null)l.memoizedState=n;else{t=u.next;var i=c=null,f=null,r=t,b=!1;do{var A=r.lane&-536870913;if(A!==r.lane?(P&A)===A:(Xt&A)===A){var h=r.revertLane;if(h===0)f!==null&&(f=f.next={lane:0,revertLane:0,gesture:null,action:r.action,hasEagerState:r.hasEagerState,eagerState:r.eagerState,next:null}),A===ca&&(b=!0);else if((Xt&h)===h){r=r.next,h===ca&&(b=!0);continue}else A={lane:0,revertLane:r.revertLane,gesture:null,action:r.action,hasEagerState:r.hasEagerState,eagerState:r.eagerState,next:null},f===null?(i=f=A,c=n):f=f.next=A,J.lanes|=h,me|=h;A=r.action,Ye&&e(n,A),n=r.hasEagerState?r.eagerState:e(n,A)}else h={lane:A,revertLane:r.revertLane,gesture:r.gesture,action:r.action,hasEagerState:r.hasEagerState,eagerState:r.eagerState,next:null},f===null?(i=f=h,c=n):f=f.next=h,J.lanes|=A,me|=A;r=r.next}while(r!==null&&r!==t);if(f===null?c=n:f.next=i,!ft(n,l.memoizedState)&&(Rl=!0,b&&(e=ia,e!==null)))throw e;l.memoizedState=n,l.baseState=c,l.baseQueue=f,a.lastRenderedState=n}return u===null&&(a.lanes=0),[l.memoizedState,a.dispatch]}function kc(l){var t=Dl(),e=t.queue;if(e===null)throw Error(o(311));e.lastRenderedReducer=l;var a=e.dispatch,u=e.pending,n=t.memoizedState;if(u!==null){e.pending=null;var c=u=u.next;do 
n=l(n,c.action),c=c.next;while(c!==u);ft(n,t.memoizedState)||(Rl=!0),t.memoizedState=n,t.baseQueue===null&&(t.baseState=n),e.lastRenderedState=n}return[n,a]}function Qs(l,t,e){var a=J,u=Dl(),n=el;if(n){if(e===void 0)throw Error(o(407));e=e()}else e=t();var c=!ft((gl||u).memoizedState,e);if(c&&(u.memoizedState=e,Rl=!0),u=u.queue,Pc(Vs.bind(null,a,u,l),[l]),u.getSnapshot!==t||c||Cl!==null&&Cl.memoizedState.tag&1){if(a.flags|=2048,ya(9,{destroy:void 0},Zs.bind(null,a,u,e,t),null),zl===null)throw Error(o(349));n||(Xt&127)!==0||Xs(a,t,e)}return e}function Xs(l,t,e){l.flags|=16384,l={getSnapshot:t,value:e},t=J.updateQueue,t===null?(t=tn(),J.updateQueue=t,t.stores=[l]):(e=t.stores,e===null?t.stores=[l]:e.push(l))}function Zs(l,t,e,a){t.value=e,t.getSnapshot=a,Ls(t)&&Ks(l)}function Vs(l,t,e){return e(function(){Ls(t)&&Ks(l)})}function Ls(l){var t=l.getSnapshot;l=l.value;try{var e=t();return!ft(l,e)}catch{return!0}}function Ks(l){var t=Ne(l,2);t!==null&&at(t,l,2)}function Fc(l){var t=Wl();if(typeof l=="function"){var e=l;if(l=e(),Ye){It(!0);try{e()}finally{It(!1)}}}return t.memoizedState=t.baseState=l,t.queue={pending:null,lanes:0,dispatch:null,lastRenderedReducer:Zt,lastRenderedState:l},t}function Js(l,t,e,a){return l.baseState=e,$c(l,gl,typeof a=="function"?a:Zt)}function Wm(l,t,e,a,u){if(cn(l))throw Error(o(485));if(l=t.action,l!==null){var n={payload:u,action:l,next:null,isTransition:!0,status:"pending",value:null,reason:null,listeners:[],then:function(c){n.listeners.push(c)}};z.T!==null?e(!0):n.isTransition=!1,a(n),e=t.pending,e===null?(n.next=t.pending=n,ws(t,n)):(n.next=e.next,t.pending=e.next=n)}}function ws(l,t){var e=t.action,a=t.payload,u=l.state;if(t.isTransition){var n=z.T,c={};z.T=c;try{var i=e(u,a),f=z.S;f!==null&&f(c,i),Ws(l,t,i)}catch(r){Ic(l,t,r)}finally{n!==null&&c.types!==null&&(n.types=c.types),z.T=n}}else try{n=e(u,a),Ws(l,t,n)}catch(r){Ic(l,t,r)}}function Ws(l,t,e){e!==null&&typeof e=="object"&&typeof 
e.then=="function"?e.then(function(a){$s(l,t,a)},function(a){return Ic(l,t,a)}):$s(l,t,e)}function $s(l,t,e){t.status="fulfilled",t.value=e,ks(t),l.state=e,t=l.pending,t!==null&&(e=t.next,e===t?l.pending=null:(e=e.next,t.next=e,ws(l,e)))}function Ic(l,t,e){var a=l.pending;if(l.pending=null,a!==null){a=a.next;do t.status="rejected",t.reason=e,ks(t),t=t.next;while(t!==a)}l.action=null}function ks(l){l=l.listeners;for(var t=0;t<l.length;t++)(0,l[t])()}function Fs(l,t){return t}function Is(l,t){if(el){var e=zl.formState;if(e!==null){l:{var a=J;if(el){if(Tl){t:{for(var u=Tl,n=Et;u.nodeType!==8;){if(!n){u=null;break t}if(u=At(u.nextSibling),u===null){u=null;break t}}n=u.data,u=n==="F!"||n==="F"?u:null}if(u){Tl=At(u.nextSibling),a=u.data==="F!";break l}}ae(a)}a=!1}a&&(t=e[0])}}return e=Wl(),e.memoizedState=e.baseState=t,a={pending:null,lanes:0,dispatch:null,lastRenderedReducer:Fs,lastRenderedState:t},e.queue=a,e=g0.bind(null,J,a),a.dispatch=e,a=Fc(!1),n=ui.bind(null,J,!1,a.queue),a=Wl(),u={state:t,dispatch:null,action:l,pending:null},a.queue=u,e=Wm.bind(null,J,u,n,e),u.dispatch=e,a.memoizedState=l,[t,e,!1]}function Ps(l){var t=Dl();return l0(t,gl,l)}function l0(l,t,e){if(t=$c(l,t,Fs)[0],l=an(Zt)[0],typeof t=="object"&&t!==null&&typeof t.then=="function")try{var a=Fa(t)}catch(c){throw c===fa?wu:c}else a=t;t=Dl();var u=t.queue,n=u.dispatch;return e!==t.memoizedState&&(J.flags|=2048,ya(9,{destroy:void 0},$m.bind(null,u,e),null)),[a,n,l]}function $m(l,t){l.action=t}function t0(l){var t=Dl(),e=gl;if(e!==null)return l0(t,e,l);Dl(),t=t.memoizedState,e=Dl();var a=e.queue.dispatch;return e.memoizedState=l,[t,a,!1]}function ya(l,t,e,a){return l={tag:l,create:e,deps:a,inst:t,next:null},t=J.updateQueue,t===null&&(t=tn(),J.updateQueue=t),e=t.lastEffect,e===null?t.lastEffect=l.next=l:(a=e.next,e.next=l,l.next=a,t.lastEffect=l),l}function e0(){return Dl().memoizedState}function un(l,t,e,a){var u=Wl();J.flags|=l,u.memoizedState=ya(1|t,{destroy:void 0},e,a===void 0?null:a)}function 
nn(l,t,e,a){var u=Dl();a=a===void 0?null:a;var n=u.memoizedState.inst;gl!==null&&a!==null&&Vc(a,gl.memoizedState.deps)?u.memoizedState=ya(t,n,e,a):(J.flags|=l,u.memoizedState=ya(1|t,n,e,a))}function a0(l,t){un(8390656,8,l,t)}function Pc(l,t){nn(2048,8,l,t)}function km(l){J.flags|=4;var t=J.updateQueue;if(t===null)t=tn(),J.updateQueue=t,t.events=[l];else{var e=t.events;e===null?t.events=[l]:e.push(l)}}function u0(l){var t=Dl().memoizedState;return km({ref:t,nextImpl:l}),function(){if((sl&2)!==0)throw Error(o(440));return t.impl.apply(void 0,arguments)}}function n0(l,t){return nn(4,2,l,t)}function c0(l,t){return nn(4,4,l,t)}function i0(l,t){if(typeof t=="function"){l=l();var e=t(l);return function(){typeof e=="function"?e():t(null)}}if(t!=null)return l=l(),t.current=l,function(){t.current=null}}function f0(l,t,e){e=e!=null?e.concat([l]):null,nn(4,4,i0.bind(null,t,l),e)}function li(){}function s0(l,t){var e=Dl();t=t===void 0?null:t;var a=e.memoizedState;return t!==null&&Vc(t,a[1])?a[0]:(e.memoizedState=[l,t],l)}function d0(l,t){var e=Dl();t=t===void 0?null:t;var a=e.memoizedState;if(t!==null&&Vc(t,a[1]))return a[0];if(a=l(),Ye){It(!0);try{l()}finally{It(!1)}}return e.memoizedState=[a,t],a}function ti(l,t,e){return e===void 0||(Xt&1073741824)!==0&&(P&261930)===0?l.memoizedState=t:(l.memoizedState=e,l=od(),J.lanes|=l,me|=l,e)}function o0(l,t,e,a){return ft(e,t)?e:da.current!==null?(l=ti(l,e,a),ft(l,t)||(Rl=!0),l):(Xt&42)===0||(Xt&1073741824)!==0&&(P&261930)===0?(Rl=!0,l.memoizedState=e):(l=od(),J.lanes|=l,me|=l,t)}function m0(l,t,e,a,u){var n=N.p;N.p=n!==0&&8>n?n:8;var c=z.T,i={};z.T=i,ui(l,!1,t,e);try{var f=u(),r=z.S;if(r!==null&&r(i,f),f!==null&&typeof f=="object"&&typeof f.then=="function"){var b=Km(f,a);Ia(l,t,b,rt(l))}else Ia(l,t,a,rt(l))}catch(A){Ia(l,t,{then:function(){},status:"rejected",reason:A},rt())}finally{N.p=n,c!==null&&i.types!==null&&(c.types=i.types),z.T=c}}function Fm(){}function ei(l,t,e,a){if(l.tag!==5)throw Error(o(476));var 
u=y0(l).queue;m0(l,u,t,V,e===null?Fm:function(){return r0(l),e(a)})}function y0(l){var t=l.memoizedState;if(t!==null)return t;t={memoizedState:V,baseState:V,baseQueue:null,queue:{pending:null,lanes:0,dispatch:null,lastRenderedReducer:Zt,lastRenderedState:V},next:null};var e={};return t.next={memoizedState:e,baseState:e,baseQueue:null,queue:{pending:null,lanes:0,dispatch:null,lastRenderedReducer:Zt,lastRenderedState:e},next:null},l.memoizedState=t,l=l.alternate,l!==null&&(l.memoizedState=t),t}function r0(l){var t=y0(l);t.next===null&&(t=l.alternate.memoizedState),Ia(l,t.next.queue,{},rt())}function ai(){return Zl(hu)}function h0(){return Dl().memoizedState}function v0(){return Dl().memoizedState}function Im(l){for(var t=l.return;t!==null;){switch(t.tag){case 24:case 3:var e=rt();l=ce(e);var a=ie(t,l,e);a!==null&&(at(a,t,e),wa(a,t,e)),t={cache:Cc()},l.payload=t;return}t=t.return}}function Pm(l,t,e){var a=rt();e={lane:a,revertLane:0,gesture:null,action:e,hasEagerState:!1,eagerState:null,next:null},cn(l)?S0(t,e):(e=Ec(l,t,e,a),e!==null&&(at(e,l,a),b0(e,t,a)))}function g0(l,t,e){var a=rt();Ia(l,t,e,a)}function Ia(l,t,e,a){var u={lane:a,revertLane:0,gesture:null,action:e,hasEagerState:!1,eagerState:null,next:null};if(cn(l))S0(t,u);else{var n=l.alternate;if(l.lanes===0&&(n===null||n.lanes===0)&&(n=t.lastRenderedReducer,n!==null))try{var c=t.lastRenderedState,i=n(c,e);if(u.hasEagerState=!0,u.eagerState=i,ft(i,c))return Qu(l,t,u,0),zl===null&&Gu(),!1}catch{}if(e=Ec(l,t,u,a),e!==null)return at(e,l,a),b0(e,t,a),!0}return!1}function ui(l,t,e,a){if(a={lane:2,revertLane:Bi(),gesture:null,action:a,hasEagerState:!1,eagerState:null,next:null},cn(l)){if(t)throw Error(o(479))}else t=Ec(l,e,a,2),t!==null&&at(t,l,2)}function cn(l){var t=l.alternate;return l===J||t!==null&&t===J}function S0(l,t){oa=Pu=!0;var e=l.pending;e===null?t.next=t:(t.next=e.next,e.next=t),l.pending=t}function b0(l,t,e){if((e&4194048)!==0){var a=t.lanes;a&=l.pendingLanes,e|=a,t.lanes=e,pf(l,e)}}var 
Pa={readContext:Zl,use:en,useCallback:Ol,useContext:Ol,useEffect:Ol,useImperativeHandle:Ol,useLayoutEffect:Ol,useInsertionEffect:Ol,useMemo:Ol,useReducer:Ol,useRef:Ol,useState:Ol,useDebugValue:Ol,useDeferredValue:Ol,useTransition:Ol,useSyncExternalStore:Ol,useId:Ol,useHostTransitionStatus:Ol,useFormState:Ol,useActionState:Ol,useOptimistic:Ol,useMemoCache:Ol,useCacheRefresh:Ol};Pa.useEffectEvent=Ol;var z0={readContext:Zl,use:en,useCallback:function(l,t){return Wl().memoizedState=[l,t===void 0?null:t],l},useContext:Zl,useEffect:a0,useImperativeHandle:function(l,t,e){e=e!=null?e.concat([l]):null,un(4194308,4,i0.bind(null,t,l),e)},useLayoutEffect:function(l,t){return un(4194308,4,l,t)},useInsertionEffect:function(l,t){un(4,2,l,t)},useMemo:function(l,t){var e=Wl();t=t===void 0?null:t;var a=l();if(Ye){It(!0);try{l()}finally{It(!1)}}return e.memoizedState=[a,t],a},useReducer:function(l,t,e){var a=Wl();if(e!==void 0){var u=e(t);if(Ye){It(!0);try{e(t)}finally{It(!1)}}}else u=t;return a.memoizedState=a.baseState=u,l={pending:null,lanes:0,dispatch:null,lastRenderedReducer:l,lastRenderedState:u},a.queue=l,l=l.dispatch=Pm.bind(null,J,l),[a.memoizedState,l]},useRef:function(l){var t=Wl();return l={current:l},t.memoizedState=l},useState:function(l){l=Fc(l);var t=l.queue,e=g0.bind(null,J,t);return t.dispatch=e,[l.memoizedState,e]},useDebugValue:li,useDeferredValue:function(l,t){var e=Wl();return ti(e,l,t)},useTransition:function(){var l=Fc(!1);return l=m0.bind(null,J,l.queue,!0,!1),Wl().memoizedState=l,[!1,l]},useSyncExternalStore:function(l,t,e){var a=J,u=Wl();if(el){if(e===void 0)throw Error(o(407));e=e()}else{if(e=t(),zl===null)throw Error(o(349));(P&127)!==0||Xs(a,t,e)}u.memoizedState=e;var n={value:e,getSnapshot:t};return u.queue=n,a0(Vs.bind(null,a,n,l),[l]),a.flags|=2048,ya(9,{destroy:void 0},Zs.bind(null,a,n,e,t),null),e},useId:function(){var l=Wl(),t=zl.identifierPrefix;if(el){var 
e=Dt,a=Nt;e=(a&~(1<<32-it(a)-1)).toString(32)+e,t="_"+t+"R_"+e,e=ln++,0<e&&(t+="H"+e.toString(32)),t+="_"}else e=Jm++,t="_"+t+"r_"+e.toString(32)+"_";return l.memoizedState=t},useHostTransitionStatus:ai,useFormState:Is,useActionState:Is,useOptimistic:function(l){var t=Wl();t.memoizedState=t.baseState=l;var e={pending:null,lanes:0,dispatch:null,lastRenderedReducer:null,lastRenderedState:null};return t.queue=e,t=ui.bind(null,J,!0,e),e.dispatch=t,[l,t]},useMemoCache:Wc,useCacheRefresh:function(){return Wl().memoizedState=Im.bind(null,J)},useEffectEvent:function(l){var t=Wl(),e={impl:l};return t.memoizedState=e,function(){if((sl&2)!==0)throw Error(o(440));return e.impl.apply(void 0,arguments)}}},ni={readContext:Zl,use:en,useCallback:s0,useContext:Zl,useEffect:Pc,useImperativeHandle:f0,useInsertionEffect:n0,useLayoutEffect:c0,useMemo:d0,useReducer:an,useRef:e0,useState:function(){return an(Zt)},useDebugValue:li,useDeferredValue:function(l,t){var e=Dl();return o0(e,gl.memoizedState,l,t)},useTransition:function(){var l=an(Zt)[0],t=Dl().memoizedState;return[typeof l=="boolean"?l:Fa(l),t]},useSyncExternalStore:Qs,useId:h0,useHostTransitionStatus:ai,useFormState:Ps,useActionState:Ps,useOptimistic:function(l,t){var e=Dl();return Js(e,gl,l,t)},useMemoCache:Wc,useCacheRefresh:v0};ni.useEffectEvent=u0;var E0={readContext:Zl,use:en,useCallback:s0,useContext:Zl,useEffect:Pc,useImperativeHandle:f0,useInsertionEffect:n0,useLayoutEffect:c0,useMemo:d0,useReducer:kc,useRef:e0,useState:function(){return kc(Zt)},useDebugValue:li,useDeferredValue:function(l,t){var e=Dl();return gl===null?ti(e,l,t):o0(e,gl.memoizedState,l,t)},useTransition:function(){var l=kc(Zt)[0],t=Dl().memoizedState;return[typeof l=="boolean"?l:Fa(l),t]},useSyncExternalStore:Qs,useId:h0,useHostTransitionStatus:ai,useFormState:t0,useActionState:t0,useOptimistic:function(l,t){var e=Dl();return gl!==null?Js(e,gl,l,t):(e.baseState=l,[l,e.queue.dispatch])},useMemoCache:Wc,useCacheRefresh:v0};E0.useEffectEvent=u0;function 
ci(l,t,e,a){t=l.memoizedState,e=e(a,t),e=e==null?t:U({},t,e),l.memoizedState=e,l.lanes===0&&(l.updateQueue.baseState=e)}var ii={enqueueSetState:function(l,t,e){l=l._reactInternals;var a=rt(),u=ce(a);u.payload=t,e!=null&&(u.callback=e),t=ie(l,u,a),t!==null&&(at(t,l,a),wa(t,l,a))},enqueueReplaceState:function(l,t,e){l=l._reactInternals;var a=rt(),u=ce(a);u.tag=1,u.payload=t,e!=null&&(u.callback=e),t=ie(l,u,a),t!==null&&(at(t,l,a),wa(t,l,a))},enqueueForceUpdate:function(l,t){l=l._reactInternals;var e=rt(),a=ce(e);a.tag=2,t!=null&&(a.callback=t),t=ie(l,a,e),t!==null&&(at(t,l,e),wa(t,l,e))}};function T0(l,t,e,a,u,n,c){return l=l.stateNode,typeof l.shouldComponentUpdate=="function"?l.shouldComponentUpdate(a,n,c):t.prototype&&t.prototype.isPureReactComponent?!Ga(e,a)||!Ga(u,n):!0}function A0(l,t,e,a){l=t.state,typeof t.componentWillReceiveProps=="function"&&t.componentWillReceiveProps(e,a),typeof t.UNSAFE_componentWillReceiveProps=="function"&&t.UNSAFE_componentWillReceiveProps(e,a),t.state!==l&&ii.enqueueReplaceState(t,t.state,null)}function Ge(l,t){var e=t;if("ref"in t){e={};for(var a in t)a!=="ref"&&(e[a]=t[a])}if(l=l.defaultProps){e===t&&(e=U({},e));for(var u in l)e[u]===void 0&&(e[u]=l[u])}return e}function p0(l){Yu(l)}function _0(l){console.error(l)}function O0(l){Yu(l)}function fn(l,t){try{var e=l.onUncaughtError;e(t.value,{componentStack:t.stack})}catch(a){setTimeout(function(){throw a})}}function M0(l,t,e){try{var a=l.onCaughtError;a(e.value,{componentStack:e.stack,errorBoundary:t.tag===1?t.stateNode:null})}catch(u){setTimeout(function(){throw u})}}function fi(l,t,e){return e=ce(e),e.tag=3,e.payload={element:null},e.callback=function(){fn(l,t)},e}function x0(l){return l=ce(l),l.tag=3,l}function N0(l,t,e,a){var u=e.type.getDerivedStateFromError;if(typeof u=="function"){var n=a.value;l.payload=function(){return u(n)},l.callback=function(){M0(t,e,a)}}var c=e.stateNode;c!==null&&typeof c.componentDidCatch=="function"&&(l.callback=function(){M0(t,e,a),typeof 
u!="function"&&(ye===null?ye=new Set([this]):ye.add(this));var i=a.stack;this.componentDidCatch(a.value,{componentStack:i!==null?i:""})})}function ly(l,t,e,a,u){if(e.flags|=32768,a!==null&&typeof a=="object"&&typeof a.then=="function"){if(t=e.alternate,t!==null&&na(t,e,u,!0),e=dt.current,e!==null){switch(e.tag){case 31:case 13:return Tt===null?zn():e.alternate===null&&Ml===0&&(Ml=3),e.flags&=-257,e.flags|=65536,e.lanes=u,a===Wu?e.flags|=16384:(t=e.updateQueue,t===null?e.updateQueue=new Set([a]):t.add(a),Ri(l,a,u)),!1;case 22:return e.flags|=65536,a===Wu?e.flags|=16384:(t=e.updateQueue,t===null?(t={transitions:null,markerInstances:null,retryQueue:new Set([a])},e.updateQueue=t):(e=t.retryQueue,e===null?t.retryQueue=new Set([a]):e.add(a)),Ri(l,a,u)),!1}throw Error(o(435,e.tag))}return Ri(l,a,u),zn(),!1}if(el)return t=dt.current,t!==null?((t.flags&65536)===0&&(t.flags|=256),t.flags|=65536,t.lanes=u,a!==Mc&&(l=Error(o(422),{cause:a}),Za(St(l,e)))):(a!==Mc&&(t=Error(o(423),{cause:a}),Za(St(t,e))),l=l.current.alternate,l.flags|=65536,u&=-u,l.lanes|=u,a=St(a,e),u=fi(l.stateNode,a,u),Yc(l,u),Ml!==4&&(Ml=2)),!1;var n=Error(o(520),{cause:a});if(n=St(n,e),iu===null?iu=[n]:iu.push(n),Ml!==4&&(Ml=2),t===null)return!0;a=St(a,e),e=t;do{switch(e.tag){case 3:return e.flags|=65536,l=u&-u,e.lanes|=l,l=fi(e.stateNode,a,l),Yc(e,l),!1;case 1:if(t=e.type,n=e.stateNode,(e.flags&128)===0&&(typeof t.getDerivedStateFromError=="function"||n!==null&&typeof n.componentDidCatch=="function"&&(ye===null||!ye.has(n))))return e.flags|=65536,u&=-u,e.lanes|=u,u=x0(u),N0(u,l,e,a),Yc(e,u),!1}e=e.return}while(e!==null);return!1}var si=Error(o(461)),Rl=!1;function Vl(l,t,e,a){t.child=l===null?Rs(t,null,e,a):qe(t,l.child,e,a)}function D0(l,t,e,a,u){e=e.render;var n=t.ref;if("ref"in a){var c={};for(var i in a)i!=="ref"&&(c[i]=a[i])}else c=a;return Re(t),a=Lc(l,t,e,c,n,u),i=Kc(),l!==null&&!Rl?(Jc(l,t,u),Vt(l,t,u)):(el&&i&&_c(t),t.flags|=1,Vl(l,t,a,u),t.child)}function U0(l,t,e,a,u){if(l===null){var 
n=e.type;return typeof n=="function"&&!Tc(n)&&n.defaultProps===void 0&&e.compare===null?(t.tag=15,t.type=n,C0(l,t,n,a,u)):(l=Zu(e.type,null,a,t,t.mode,u),l.ref=t.ref,l.return=t,t.child=l)}if(n=l.child,!gi(l,u)){var c=n.memoizedProps;if(e=e.compare,e=e!==null?e:Ga,e(c,a)&&l.ref===t.ref)return Vt(l,t,u)}return t.flags|=1,l=qt(n,a),l.ref=t.ref,l.return=t,t.child=l}function C0(l,t,e,a,u){if(l!==null){var n=l.memoizedProps;if(Ga(n,a)&&l.ref===t.ref)if(Rl=!1,t.pendingProps=a=n,gi(l,u))(l.flags&131072)!==0&&(Rl=!0);else return t.lanes=l.lanes,Vt(l,t,u)}return di(l,t,e,a,u)}function R0(l,t,e,a){var u=a.children,n=l!==null?l.memoizedState:null;if(l===null&&t.stateNode===null&&(t.stateNode={_visibility:1,_pendingMarkers:null,_retryCache:null,_transitions:null}),a.mode==="hidden"){if((t.flags&128)!==0){if(n=n!==null?n.baseLanes|e:e,l!==null){for(a=t.child=l.child,u=0;a!==null;)u=u|a.lanes|a.childLanes,a=a.sibling;a=u&~n}else a=0,t.child=null;return j0(l,t,n,e,a)}if((e&536870912)!==0)t.memoizedState={baseLanes:0,cachePool:null},l!==null&&Ju(t,n!==null?n.cachePool:null),n!==null?Bs(t,n):Qc(),qs(t);else return a=t.lanes=536870912,j0(l,t,n!==null?n.baseLanes|e:e,e,a)}else n!==null?(Ju(t,n.cachePool),Bs(t,n),se(),t.memoizedState=null):(l!==null&&Ju(t,null),Qc(),se());return Vl(l,t,u,e),t.child}function lu(l,t){return l!==null&&l.tag===22||t.stateNode!==null||(t.stateNode={_visibility:1,_pendingMarkers:null,_retryCache:null,_transitions:null}),t.sibling}function j0(l,t,e,a,u){var n=jc();return n=n===null?null:{parent:Ul._currentValue,pool:n},t.memoizedState={baseLanes:e,cachePool:n},l!==null&&Ju(t,null),Qc(),qs(t),l!==null&&na(l,t,a,!0),t.childLanes=u,null}function sn(l,t){return t=on({mode:t.mode,children:t.children},l.mode),t.ref=l.ref,l.child=t,t.return=l,t}function H0(l,t,e){return qe(t,l.child,null,e),l=sn(t,t.pendingProps),l.flags|=2,ot(t),t.memoizedState=null,l}function ty(l,t,e){var 
a=t.pendingProps,u=(t.flags&128)!==0;if(t.flags&=-129,l===null){if(el){if(a.mode==="hidden")return l=sn(t,a),t.lanes=536870912,lu(null,l);if(Zc(t),(l=Tl)?(l=wd(l,Et),l=l!==null&&l.data==="&"?l:null,l!==null&&(t.memoizedState={dehydrated:l,treeContext:te!==null?{id:Nt,overflow:Dt}:null,retryLane:536870912,hydrationErrors:null},e=Ss(l),e.return=t,t.child=e,Xl=t,Tl=null)):l=null,l===null)throw ae(t);return t.lanes=536870912,null}return sn(t,a)}var n=l.memoizedState;if(n!==null){var c=n.dehydrated;if(Zc(t),u)if(t.flags&256)t.flags&=-257,t=H0(l,t,e);else if(t.memoizedState!==null)t.child=l.child,t.flags|=128,t=null;else throw Error(o(558));else if(Rl||na(l,t,e,!1),u=(e&l.childLanes)!==0,Rl||u){if(a=zl,a!==null&&(c=_f(a,e),c!==0&&c!==n.retryLane))throw n.retryLane=c,Ne(l,c),at(a,l,c),si;zn(),t=H0(l,t,e)}else l=n.treeContext,Tl=At(c.nextSibling),Xl=t,el=!0,ee=null,Et=!1,l!==null&&Es(t,l),t=sn(t,a),t.flags|=4096;return t}return l=qt(l.child,{mode:a.mode,children:a.children}),l.ref=t.ref,t.child=l,l.return=t,l}function dn(l,t){var e=t.ref;if(e===null)l!==null&&l.ref!==null&&(t.flags|=4194816);else{if(typeof e!="function"&&typeof e!="object")throw Error(o(284));(l===null||l.ref!==e)&&(t.flags|=4194816)}}function di(l,t,e,a,u){return Re(t),e=Lc(l,t,e,a,void 0,u),a=Kc(),l!==null&&!Rl?(Jc(l,t,u),Vt(l,t,u)):(el&&a&&_c(t),t.flags|=1,Vl(l,t,e,u),t.child)}function B0(l,t,e,a,u,n){return Re(t),t.updateQueue=null,e=Gs(t,a,e,u),Ys(l),a=Kc(),l!==null&&!Rl?(Jc(l,t,n),Vt(l,t,n)):(el&&a&&_c(t),t.flags|=1,Vl(l,t,e,n),t.child)}function q0(l,t,e,a,u){if(Re(t),t.stateNode===null){var n=ta,c=e.contextType;typeof c=="object"&&c!==null&&(n=Zl(c)),n=new e(a,n),t.memoizedState=n.state!==null&&n.state!==void 0?n.state:null,n.updater=ii,t.stateNode=n,n._reactInternals=t,n=t.stateNode,n.props=a,n.state=t.memoizedState,n.refs={},Bc(t),c=e.contextType,n.context=typeof c=="object"&&c!==null?Zl(c):ta,n.state=t.memoizedState,c=e.getDerivedStateFromProps,typeof 
c=="function"&&(ci(t,e,c,a),n.state=t.memoizedState),typeof e.getDerivedStateFromProps=="function"||typeof n.getSnapshotBeforeUpdate=="function"||typeof n.UNSAFE_componentWillMount!="function"&&typeof n.componentWillMount!="function"||(c=n.state,typeof n.componentWillMount=="function"&&n.componentWillMount(),typeof n.UNSAFE_componentWillMount=="function"&&n.UNSAFE_componentWillMount(),c!==n.state&&ii.enqueueReplaceState(n,n.state,null),$a(t,a,n,u),Wa(),n.state=t.memoizedState),typeof n.componentDidMount=="function"&&(t.flags|=4194308),a=!0}else if(l===null){n=t.stateNode;var i=t.memoizedProps,f=Ge(e,i);n.props=f;var r=n.context,b=e.contextType;c=ta,typeof b=="object"&&b!==null&&(c=Zl(b));var A=e.getDerivedStateFromProps;b=typeof A=="function"||typeof n.getSnapshotBeforeUpdate=="function",i=t.pendingProps!==i,b||typeof n.UNSAFE_componentWillReceiveProps!="function"&&typeof n.componentWillReceiveProps!="function"||(i||r!==c)&&A0(t,n,a,c),ne=!1;var h=t.memoizedState;n.state=h,$a(t,a,n,u),Wa(),r=t.memoizedState,i||h!==r||ne?(typeof A=="function"&&(ci(t,e,A,a),r=t.memoizedState),(f=ne||T0(t,e,f,a,h,r,c))?(b||typeof n.UNSAFE_componentWillMount!="function"&&typeof n.componentWillMount!="function"||(typeof n.componentWillMount=="function"&&n.componentWillMount(),typeof n.UNSAFE_componentWillMount=="function"&&n.UNSAFE_componentWillMount()),typeof n.componentDidMount=="function"&&(t.flags|=4194308)):(typeof n.componentDidMount=="function"&&(t.flags|=4194308),t.memoizedProps=a,t.memoizedState=r),n.props=a,n.state=r,n.context=c,a=f):(typeof n.componentDidMount=="function"&&(t.flags|=4194308),a=!1)}else{n=t.stateNode,qc(l,t),c=t.memoizedProps,b=Ge(e,c),n.props=b,A=t.pendingProps,h=n.context,r=e.contextType,f=ta,typeof r=="object"&&r!==null&&(f=Zl(r)),i=e.getDerivedStateFromProps,(r=typeof i=="function"||typeof n.getSnapshotBeforeUpdate=="function")||typeof n.UNSAFE_componentWillReceiveProps!="function"&&typeof 
n.componentWillReceiveProps!="function"||(c!==A||h!==f)&&A0(t,n,a,f),ne=!1,h=t.memoizedState,n.state=h,$a(t,a,n,u),Wa();var S=t.memoizedState;c!==A||h!==S||ne||l!==null&&l.dependencies!==null&&Lu(l.dependencies)?(typeof i=="function"&&(ci(t,e,i,a),S=t.memoizedState),(b=ne||T0(t,e,b,a,h,S,f)||l!==null&&l.dependencies!==null&&Lu(l.dependencies))?(r||typeof n.UNSAFE_componentWillUpdate!="function"&&typeof n.componentWillUpdate!="function"||(typeof n.componentWillUpdate=="function"&&n.componentWillUpdate(a,S,f),typeof n.UNSAFE_componentWillUpdate=="function"&&n.UNSAFE_componentWillUpdate(a,S,f)),typeof n.componentDidUpdate=="function"&&(t.flags|=4),typeof n.getSnapshotBeforeUpdate=="function"&&(t.flags|=1024)):(typeof n.componentDidUpdate!="function"||c===l.memoizedProps&&h===l.memoizedState||(t.flags|=4),typeof n.getSnapshotBeforeUpdate!="function"||c===l.memoizedProps&&h===l.memoizedState||(t.flags|=1024),t.memoizedProps=a,t.memoizedState=S),n.props=a,n.state=S,n.context=f,a=b):(typeof n.componentDidUpdate!="function"||c===l.memoizedProps&&h===l.memoizedState||(t.flags|=4),typeof n.getSnapshotBeforeUpdate!="function"||c===l.memoizedProps&&h===l.memoizedState||(t.flags|=1024),a=!1)}return n=a,dn(l,t),a=(t.flags&128)!==0,n||a?(n=t.stateNode,e=a&&typeof e.getDerivedStateFromError!="function"?null:n.render(),t.flags|=1,l!==null&&a?(t.child=qe(t,l.child,null,u),t.child=qe(t,null,e,u)):Vl(l,t,e,u),t.memoizedState=n.state,l=t.child):l=Vt(l,t,u),l}function Y0(l,t,e,a){return Ue(),t.flags|=256,Vl(l,t,e,a),t.child}var oi={dehydrated:null,treeContext:null,retryLane:0,hydrationErrors:null};function mi(l){return{baseLanes:l,cachePool:Ms()}}function yi(l,t,e){return l=l!==null?l.childLanes&~e:0,t&&(l|=yt),l}function G0(l,t,e){var 
a=t.pendingProps,u=!1,n=(t.flags&128)!==0,c;if((c=n)||(c=l!==null&&l.memoizedState===null?!1:(Nl.current&2)!==0),c&&(u=!0,t.flags&=-129),c=(t.flags&32)!==0,t.flags&=-33,l===null){if(el){if(u?fe(t):se(),(l=Tl)?(l=wd(l,Et),l=l!==null&&l.data!=="&"?l:null,l!==null&&(t.memoizedState={dehydrated:l,treeContext:te!==null?{id:Nt,overflow:Dt}:null,retryLane:536870912,hydrationErrors:null},e=Ss(l),e.return=t,t.child=e,Xl=t,Tl=null)):l=null,l===null)throw ae(t);return $i(l)?t.lanes=32:t.lanes=536870912,null}var i=a.children;return a=a.fallback,u?(se(),u=t.mode,i=on({mode:"hidden",children:i},u),a=De(a,u,e,null),i.return=t,a.return=t,i.sibling=a,t.child=i,a=t.child,a.memoizedState=mi(e),a.childLanes=yi(l,c,e),t.memoizedState=oi,lu(null,a)):(fe(t),ri(t,i))}var f=l.memoizedState;if(f!==null&&(i=f.dehydrated,i!==null)){if(n)t.flags&256?(fe(t),t.flags&=-257,t=hi(l,t,e)):t.memoizedState!==null?(se(),t.child=l.child,t.flags|=128,t=null):(se(),i=a.fallback,u=t.mode,a=on({mode:"visible",children:a.children},u),i=De(i,u,e,null),i.flags|=2,a.return=t,i.return=t,a.sibling=i,t.child=a,qe(t,l.child,null,e),a=t.child,a.memoizedState=mi(e),a.childLanes=yi(l,c,e),t.memoizedState=oi,t=lu(null,a));else if(fe(t),$i(i)){if(c=i.nextSibling&&i.nextSibling.dataset,c)var r=c.dgst;c=r,a=Error(o(419)),a.stack="",a.digest=c,Za({value:a,source:null,stack:null}),t=hi(l,t,e)}else if(Rl||na(l,t,e,!1),c=(e&l.childLanes)!==0,Rl||c){if(c=zl,c!==null&&(a=_f(c,e),a!==0&&a!==f.retryLane))throw f.retryLane=a,Ne(l,a),at(c,l,a),si;Wi(i)||zn(),t=hi(l,t,e)}else Wi(i)?(t.flags|=192,t.child=l.child,t=null):(l=f.treeContext,Tl=At(i.nextSibling),Xl=t,el=!0,ee=null,Et=!1,l!==null&&Es(t,l),t=ri(t,a.children),t.flags|=4096);return t}return 
u?(se(),i=a.fallback,u=t.mode,f=l.child,r=f.sibling,a=qt(f,{mode:"hidden",children:a.children}),a.subtreeFlags=f.subtreeFlags&65011712,r!==null?i=qt(r,i):(i=De(i,u,e,null),i.flags|=2),i.return=t,a.return=t,a.sibling=i,t.child=a,lu(null,a),a=t.child,i=l.child.memoizedState,i===null?i=mi(e):(u=i.cachePool,u!==null?(f=Ul._currentValue,u=u.parent!==f?{parent:f,pool:f}:u):u=Ms(),i={baseLanes:i.baseLanes|e,cachePool:u}),a.memoizedState=i,a.childLanes=yi(l,c,e),t.memoizedState=oi,lu(l.child,a)):(fe(t),e=l.child,l=e.sibling,e=qt(e,{mode:"visible",children:a.children}),e.return=t,e.sibling=null,l!==null&&(c=t.deletions,c===null?(t.deletions=[l],t.flags|=16):c.push(l)),t.child=e,t.memoizedState=null,e)}function ri(l,t){return t=on({mode:"visible",children:t},l.mode),t.return=l,l.child=t}function on(l,t){return l=st(22,l,null,t),l.lanes=0,l}function hi(l,t,e){return qe(t,l.child,null,e),l=ri(t,t.pendingProps.children),l.flags|=2,t.memoizedState=null,l}function Q0(l,t,e){l.lanes|=t;var a=l.alternate;a!==null&&(a.lanes|=t),Dc(l.return,t,e)}function vi(l,t,e,a,u,n){var c=l.memoizedState;c===null?l.memoizedState={isBackwards:t,rendering:null,renderingStartTime:0,last:a,tail:e,tailMode:u,treeForkCount:n}:(c.isBackwards=t,c.rendering=null,c.renderingStartTime=0,c.last=a,c.tail=e,c.tailMode=u,c.treeForkCount=n)}function X0(l,t,e){var a=t.pendingProps,u=a.revealOrder,n=a.tail;a=a.children;var c=Nl.current,i=(c&2)!==0;if(i?(c=c&1|2,t.flags|=128):c&=1,O(Nl,c),Vl(l,t,a,e),a=el?Xa:0,!i&&l!==null&&(l.flags&128)!==0)l:for(l=t.child;l!==null;){if(l.tag===13)l.memoizedState!==null&&Q0(l,e,t);else if(l.tag===19)Q0(l,e,t);else if(l.child!==null){l.child.return=l,l=l.child;continue}if(l===t)break l;for(;l.sibling===null;){if(l.return===null||l.return===t)break 
l;l=l.return}l.sibling.return=l.return,l=l.sibling}switch(u){case"forwards":for(e=t.child,u=null;e!==null;)l=e.alternate,l!==null&&Iu(l)===null&&(u=e),e=e.sibling;e=u,e===null?(u=t.child,t.child=null):(u=e.sibling,e.sibling=null),vi(t,!1,u,e,n,a);break;case"backwards":case"unstable_legacy-backwards":for(e=null,u=t.child,t.child=null;u!==null;){if(l=u.alternate,l!==null&&Iu(l)===null){t.child=u;break}l=u.sibling,u.sibling=e,e=u,u=l}vi(t,!0,e,null,n,a);break;case"together":vi(t,!1,null,null,void 0,a);break;default:t.memoizedState=null}return t.child}function Vt(l,t,e){if(l!==null&&(t.dependencies=l.dependencies),me|=t.lanes,(e&t.childLanes)===0)if(l!==null){if(na(l,t,e,!1),(e&t.childLanes)===0)return null}else return null;if(l!==null&&t.child!==l.child)throw Error(o(153));if(t.child!==null){for(l=t.child,e=qt(l,l.pendingProps),t.child=e,e.return=t;l.sibling!==null;)l=l.sibling,e=e.sibling=qt(l,l.pendingProps),e.return=t;e.sibling=null}return t.child}function gi(l,t){return(l.lanes&t)!==0?!0:(l=l.dependencies,!!(l!==null&&Lu(l)))}function ey(l,t,e){switch(t.tag){case 3:wl(t,t.stateNode.containerInfo),ue(t,Ul,l.memoizedState.cache),Ue();break;case 27:case 5:Oa(t);break;case 4:wl(t,t.stateNode.containerInfo);break;case 10:ue(t,t.type,t.memoizedProps.value);break;case 31:if(t.memoizedState!==null)return t.flags|=128,Zc(t),null;break;case 13:var a=t.memoizedState;if(a!==null)return a.dehydrated!==null?(fe(t),t.flags|=128,null):(e&t.child.childLanes)!==0?G0(l,t,e):(fe(t),l=Vt(l,t,e),l!==null?l.sibling:null);fe(t);break;case 19:var u=(l.flags&128)!==0;if(a=(e&t.childLanes)!==0,a||(na(l,t,e,!1),a=(e&t.childLanes)!==0),u){if(a)return X0(l,t,e);t.flags|=128}if(u=t.memoizedState,u!==null&&(u.rendering=null,u.tail=null,u.lastEffect=null),O(Nl,Nl.current),a)break;return null;case 22:return t.lanes=0,R0(l,t,e,t.pendingProps);case 24:ue(t,Ul,l.memoizedState.cache)}return Vt(l,t,e)}function 
Z0(l,t,e){if(l!==null)if(l.memoizedProps!==t.pendingProps)Rl=!0;else{if(!gi(l,e)&&(t.flags&128)===0)return Rl=!1,ey(l,t,e);Rl=(l.flags&131072)!==0}else Rl=!1,el&&(t.flags&1048576)!==0&&zs(t,Xa,t.index);switch(t.lanes=0,t.tag){case 16:l:{var a=t.pendingProps;if(l=He(t.elementType),t.type=l,typeof l=="function")Tc(l)?(a=Ge(l,a),t.tag=1,t=q0(null,t,l,a,e)):(t.tag=0,t=di(null,t,l,a,e));else{if(l!=null){var u=l.$$typeof;if(u===nl){t.tag=11,t=D0(null,t,l,a,e);break l}else if(u===W){t.tag=14,t=U0(null,t,l,a,e);break l}}throw t=_t(l)||l,Error(o(306,t,""))}}return t;case 0:return di(l,t,t.type,t.pendingProps,e);case 1:return a=t.type,u=Ge(a,t.pendingProps),q0(l,t,a,u,e);case 3:l:{if(wl(t,t.stateNode.containerInfo),l===null)throw Error(o(387));a=t.pendingProps;var n=t.memoizedState;u=n.element,qc(l,t),$a(t,a,null,e);var c=t.memoizedState;if(a=c.cache,ue(t,Ul,a),a!==n.cache&&Uc(t,[Ul],e,!0),Wa(),a=c.element,n.isDehydrated)if(n={element:a,isDehydrated:!1,cache:c.cache},t.updateQueue.baseState=n,t.memoizedState=n,t.flags&256){t=Y0(l,t,a,e);break l}else if(a!==u){u=St(Error(o(424)),t),Za(u),t=Y0(l,t,a,e);break l}else for(l=t.stateNode.containerInfo,l.nodeType===9?l=l.body:l=l.nodeName==="HTML"?l.ownerDocument.body:l,Tl=At(l.firstChild),Xl=t,el=!0,ee=null,Et=!0,e=Rs(t,null,a,e),t.child=e;e;)e.flags=e.flags&-3|4096,e=e.sibling;else{if(Ue(),a===u){t=Vt(l,t,e);break l}Vl(l,t,a,e)}t=t.child}return t;case 26:return dn(l,t),l===null?(e=Pd(t.type,null,t.pendingProps,null))?t.memoizedState=e:el||(e=t.type,l=t.pendingProps,a=Mn(K.current).createElement(e),a[Ql]=t,a[Fl]=l,Ll(a,e,l),ql(a),t.stateNode=a):t.memoizedState=Pd(t.type,l.memoizedProps,t.pendingProps,l.memoizedState),null;case 27:return Oa(t),l===null&&el&&(a=t.stateNode=kd(t.type,t.pendingProps,K.current),Xl=t,Et=!0,u=Tl,ge(t.type)?(ki=u,Tl=At(a.firstChild)):Tl=u),Vl(l,t,t.pendingProps.children,e),dn(l,t),l===null&&(t.flags|=4194304),t.child;case 5:return 
l===null&&el&&((u=a=Tl)&&(a=Cy(a,t.type,t.pendingProps,Et),a!==null?(t.stateNode=a,Xl=t,Tl=At(a.firstChild),Et=!1,u=!0):u=!1),u||ae(t)),Oa(t),u=t.type,n=t.pendingProps,c=l!==null?l.memoizedProps:null,a=n.children,Ki(u,n)?a=null:c!==null&&Ki(u,c)&&(t.flags|=32),t.memoizedState!==null&&(u=Lc(l,t,wm,null,null,e),hu._currentValue=u),dn(l,t),Vl(l,t,a,e),t.child;case 6:return l===null&&el&&((l=e=Tl)&&(e=Ry(e,t.pendingProps,Et),e!==null?(t.stateNode=e,Xl=t,Tl=null,l=!0):l=!1),l||ae(t)),null;case 13:return G0(l,t,e);case 4:return wl(t,t.stateNode.containerInfo),a=t.pendingProps,l===null?t.child=qe(t,null,a,e):Vl(l,t,a,e),t.child;case 11:return D0(l,t,t.type,t.pendingProps,e);case 7:return Vl(l,t,t.pendingProps,e),t.child;case 8:return Vl(l,t,t.pendingProps.children,e),t.child;case 12:return Vl(l,t,t.pendingProps.children,e),t.child;case 10:return a=t.pendingProps,ue(t,t.type,a.value),Vl(l,t,a.children,e),t.child;case 9:return u=t.type._context,a=t.pendingProps.children,Re(t),u=Zl(u),a=a(u),t.flags|=1,Vl(l,t,a,e),t.child;case 14:return U0(l,t,t.type,t.pendingProps,e);case 15:return C0(l,t,t.type,t.pendingProps,e);case 19:return X0(l,t,e);case 31:return ty(l,t,e);case 22:return R0(l,t,e,t.pendingProps);case 24:return Re(t),a=Zl(Ul),l===null?(u=jc(),u===null&&(u=zl,n=Cc(),u.pooledCache=n,n.refCount++,n!==null&&(u.pooledCacheLanes|=e),u=n),t.memoizedState={parent:a,cache:u},Bc(t),ue(t,Ul,u)):((l.lanes&e)!==0&&(qc(l,t),$a(t,null,null,e),Wa()),u=l.memoizedState,n=t.memoizedState,u.parent!==a?(u={parent:a,cache:a},t.memoizedState=u,t.lanes===0&&(t.memoizedState=t.updateQueue.baseState=u),ue(t,Ul,a)):(a=n.cache,ue(t,Ul,a),a!==u.cache&&Uc(t,[Ul],e,!0))),Vl(l,t,t.pendingProps.children,e),t.child;case 29:throw t.pendingProps}throw Error(o(156,t.tag))}function Lt(l){l.flags|=4}function Si(l,t,e,a,u){if((t=(l.mode&32)!==0)&&(t=!1),t){if(l.flags|=16777216,(u&335544128)===u)if(l.stateNode.complete)l.flags|=8192;else if(hd())l.flags|=8192;else throw Be=Wu,Hc}else 
l.flags&=-16777217}function V0(l,t){if(t.type!=="stylesheet"||(t.state.loading&4)!==0)l.flags&=-16777217;else if(l.flags|=16777216,!uo(t))if(hd())l.flags|=8192;else throw Be=Wu,Hc}function mn(l,t){t!==null&&(l.flags|=4),l.flags&16384&&(t=l.tag!==22?Tf():536870912,l.lanes|=t,ga|=t)}function tu(l,t){if(!el)switch(l.tailMode){case"hidden":t=l.tail;for(var e=null;t!==null;)t.alternate!==null&&(e=t),t=t.sibling;e===null?l.tail=null:e.sibling=null;break;case"collapsed":e=l.tail;for(var a=null;e!==null;)e.alternate!==null&&(a=e),e=e.sibling;a===null?t||l.tail===null?l.tail=null:l.tail.sibling=null:a.sibling=null}}function Al(l){var t=l.alternate!==null&&l.alternate.child===l.child,e=0,a=0;if(t)for(var u=l.child;u!==null;)e|=u.lanes|u.childLanes,a|=u.subtreeFlags&65011712,a|=u.flags&65011712,u.return=l,u=u.sibling;else for(u=l.child;u!==null;)e|=u.lanes|u.childLanes,a|=u.subtreeFlags,a|=u.flags,u.return=l,u=u.sibling;return l.subtreeFlags|=a,l.childLanes=e,t}function ay(l,t,e){var a=t.pendingProps;switch(Oc(t),t.tag){case 16:case 15:case 0:case 11:case 7:case 8:case 12:case 9:case 14:return Al(t),null;case 1:return Al(t),null;case 3:return e=t.stateNode,a=null,l!==null&&(a=l.memoizedState.cache),t.memoizedState.cache!==a&&(t.flags|=2048),Qt(Ul),xl(),e.pendingContext&&(e.context=e.pendingContext,e.pendingContext=null),(l===null||l.child===null)&&(ua(t)?Lt(t):l===null||l.memoizedState.isDehydrated&&(t.flags&256)===0||(t.flags|=1024,xc())),Al(t),null;case 26:var u=t.type,n=t.memoizedState;return l===null?(Lt(t),n!==null?(Al(t),V0(t,n)):(Al(t),Si(t,u,null,a,e))):n?n!==l.memoizedState?(Lt(t),Al(t),V0(t,n)):(Al(t),t.flags&=-16777217):(l=l.memoizedProps,l!==a&&Lt(t),Al(t),Si(t,u,l,a,e)),null;case 27:if(Tu(t),e=K.current,u=t.type,l!==null&&t.stateNode!=null)l.memoizedProps!==a&&Lt(t);else{if(!a){if(t.stateNode===null)throw Error(o(166));return Al(t),null}l=D.current,ua(t)?Ts(t):(l=kd(u,a,e),t.stateNode=l,Lt(t))}return Al(t),null;case 
5:if(Tu(t),u=t.type,l!==null&&t.stateNode!=null)l.memoizedProps!==a&&Lt(t);else{if(!a){if(t.stateNode===null)throw Error(o(166));return Al(t),null}if(n=D.current,ua(t))Ts(t);else{var c=Mn(K.current);switch(n){case 1:n=c.createElementNS("http://www.w3.org/2000/svg",u);break;case 2:n=c.createElementNS("http://www.w3.org/1998/Math/MathML",u);break;default:switch(u){case"svg":n=c.createElementNS("http://www.w3.org/2000/svg",u);break;case"math":n=c.createElementNS("http://www.w3.org/1998/Math/MathML",u);break;case"script":n=c.createElement("div"),n.innerHTML="<script><\/script>",n=n.removeChild(n.firstChild);break;case"select":n=typeof a.is=="string"?c.createElement("select",{is:a.is}):c.createElement("select"),a.multiple?n.multiple=!0:a.size&&(n.size=a.size);break;default:n=typeof a.is=="string"?c.createElement(u,{is:a.is}):c.createElement(u)}}n[Ql]=t,n[Fl]=a;l:for(c=t.child;c!==null;){if(c.tag===5||c.tag===6)n.appendChild(c.stateNode);else if(c.tag!==4&&c.tag!==27&&c.child!==null){c.child.return=c,c=c.child;continue}if(c===t)break l;for(;c.sibling===null;){if(c.return===null||c.return===t)break l;c=c.return}c.sibling.return=c.return,c=c.sibling}t.stateNode=n;l:switch(Ll(n,u,a),u){case"button":case"input":case"select":case"textarea":a=!!a.autoFocus;break l;case"img":a=!0;break l;default:a=!1}a&&Lt(t)}}return Al(t),Si(t,t.type,l===null?null:l.memoizedProps,t.pendingProps,e),null;case 6:if(l&&t.stateNode!=null)l.memoizedProps!==a&&Lt(t);else{if(typeof a!="string"&&t.stateNode===null)throw Error(o(166));if(l=K.current,ua(t)){if(l=t.stateNode,e=t.memoizedProps,a=null,u=Xl,u!==null)switch(u.tag){case 27:case 5:a=u.memoizedProps}l[Ql]=t,l=!!(l.nodeValue===e||a!==null&&a.suppressHydrationWarning===!0||Gd(l.nodeValue,e)),l||ae(t,!0)}else l=Mn(l).createTextNode(a),l[Ql]=t,t.stateNode=l}return Al(t),null;case 31:if(e=t.memoizedState,l===null||l.memoizedState!==null){if(a=ua(t),e!==null){if(l===null){if(!a)throw 
Error(o(318));if(l=t.memoizedState,l=l!==null?l.dehydrated:null,!l)throw Error(o(557));l[Ql]=t}else Ue(),(t.flags&128)===0&&(t.memoizedState=null),t.flags|=4;Al(t),l=!1}else e=xc(),l!==null&&l.memoizedState!==null&&(l.memoizedState.hydrationErrors=e),l=!0;if(!l)return t.flags&256?(ot(t),t):(ot(t),null);if((t.flags&128)!==0)throw Error(o(558))}return Al(t),null;case 13:if(a=t.memoizedState,l===null||l.memoizedState!==null&&l.memoizedState.dehydrated!==null){if(u=ua(t),a!==null&&a.dehydrated!==null){if(l===null){if(!u)throw Error(o(318));if(u=t.memoizedState,u=u!==null?u.dehydrated:null,!u)throw Error(o(317));u[Ql]=t}else Ue(),(t.flags&128)===0&&(t.memoizedState=null),t.flags|=4;Al(t),u=!1}else u=xc(),l!==null&&l.memoizedState!==null&&(l.memoizedState.hydrationErrors=u),u=!0;if(!u)return t.flags&256?(ot(t),t):(ot(t),null)}return ot(t),(t.flags&128)!==0?(t.lanes=e,t):(e=a!==null,l=l!==null&&l.memoizedState!==null,e&&(a=t.child,u=null,a.alternate!==null&&a.alternate.memoizedState!==null&&a.alternate.memoizedState.cachePool!==null&&(u=a.alternate.memoizedState.cachePool.pool),n=null,a.memoizedState!==null&&a.memoizedState.cachePool!==null&&(n=a.memoizedState.cachePool.pool),n!==u&&(a.flags|=2048)),e!==l&&e&&(t.child.flags|=8192),mn(t,t.updateQueue),Al(t),null);case 4:return xl(),l===null&&Qi(t.stateNode.containerInfo),Al(t),null;case 10:return Qt(t.type),Al(t),null;case 19:if(v(Nl),a=t.memoizedState,a===null)return Al(t),null;if(u=(t.flags&128)!==0,n=a.rendering,n===null)if(u)tu(a,!1);else{if(Ml!==0||l!==null&&(l.flags&128)!==0)for(l=t.child;l!==null;){if(n=Iu(l),n!==null){for(t.flags|=128,tu(a,!1),l=n.updateQueue,t.updateQueue=l,mn(t,l),t.subtreeFlags=0,l=e,e=t.child;e!==null;)gs(e,l),e=e.sibling;return 
O(Nl,Nl.current&1|2),el&&Yt(t,a.treeForkCount),t.child}l=l.sibling}a.tail!==null&&nt()>gn&&(t.flags|=128,u=!0,tu(a,!1),t.lanes=4194304)}else{if(!u)if(l=Iu(n),l!==null){if(t.flags|=128,u=!0,l=l.updateQueue,t.updateQueue=l,mn(t,l),tu(a,!0),a.tail===null&&a.tailMode==="hidden"&&!n.alternate&&!el)return Al(t),null}else 2*nt()-a.renderingStartTime>gn&&e!==536870912&&(t.flags|=128,u=!0,tu(a,!1),t.lanes=4194304);a.isBackwards?(n.sibling=t.child,t.child=n):(l=a.last,l!==null?l.sibling=n:t.child=n,a.last=n)}return a.tail!==null?(l=a.tail,a.rendering=l,a.tail=l.sibling,a.renderingStartTime=nt(),l.sibling=null,e=Nl.current,O(Nl,u?e&1|2:e&1),el&&Yt(t,a.treeForkCount),l):(Al(t),null);case 22:case 23:return ot(t),Xc(),a=t.memoizedState!==null,l!==null?l.memoizedState!==null!==a&&(t.flags|=8192):a&&(t.flags|=8192),a?(e&536870912)!==0&&(t.flags&128)===0&&(Al(t),t.subtreeFlags&6&&(t.flags|=8192)):Al(t),e=t.updateQueue,e!==null&&mn(t,e.retryQueue),e=null,l!==null&&l.memoizedState!==null&&l.memoizedState.cachePool!==null&&(e=l.memoizedState.cachePool.pool),a=null,t.memoizedState!==null&&t.memoizedState.cachePool!==null&&(a=t.memoizedState.cachePool.pool),a!==e&&(t.flags|=2048),l!==null&&v(je),null;case 24:return e=null,l!==null&&(e=l.memoizedState.cache),t.memoizedState.cache!==e&&(t.flags|=2048),Qt(Ul),Al(t),null;case 25:return null;case 30:return null}throw Error(o(156,t.tag))}function uy(l,t){switch(Oc(t),t.tag){case 1:return l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 3:return Qt(Ul),xl(),l=t.flags,(l&65536)!==0&&(l&128)===0?(t.flags=l&-65537|128,t):null;case 26:case 27:case 5:return Tu(t),null;case 31:if(t.memoizedState!==null){if(ot(t),t.alternate===null)throw Error(o(340));Ue()}return l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 13:if(ot(t),l=t.memoizedState,l!==null&&l.dehydrated!==null){if(t.alternate===null)throw Error(o(340));Ue()}return l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 19:return v(Nl),null;case 4:return xl(),null;case 10:return 
Qt(t.type),null;case 22:case 23:return ot(t),Xc(),l!==null&&v(je),l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 24:return Qt(Ul),null;case 25:return null;default:return null}}function L0(l,t){switch(Oc(t),t.tag){case 3:Qt(Ul),xl();break;case 26:case 27:case 5:Tu(t);break;case 4:xl();break;case 31:t.memoizedState!==null&&ot(t);break;case 13:ot(t);break;case 19:v(Nl);break;case 10:Qt(t.type);break;case 22:case 23:ot(t),Xc(),l!==null&&v(je);break;case 24:Qt(Ul)}}function eu(l,t){try{var e=t.updateQueue,a=e!==null?e.lastEffect:null;if(a!==null){var u=a.next;e=u;do{if((e.tag&l)===l){a=void 0;var n=e.create,c=e.inst;a=n(),c.destroy=a}e=e.next}while(e!==u)}}catch(i){hl(t,t.return,i)}}function de(l,t,e){try{var a=t.updateQueue,u=a!==null?a.lastEffect:null;if(u!==null){var n=u.next;a=n;do{if((a.tag&l)===l){var c=a.inst,i=c.destroy;if(i!==void 0){c.destroy=void 0,u=t;var f=e,r=i;try{r()}catch(b){hl(u,f,b)}}}a=a.next}while(a!==n)}}catch(b){hl(t,t.return,b)}}function K0(l){var t=l.updateQueue;if(t!==null){var e=l.stateNode;try{Hs(t,e)}catch(a){hl(l,l.return,a)}}}function J0(l,t,e){e.props=Ge(l.type,l.memoizedProps),e.state=l.memoizedState;try{e.componentWillUnmount()}catch(a){hl(l,t,a)}}function au(l,t){try{var e=l.ref;if(e!==null){switch(l.tag){case 26:case 27:case 5:var a=l.stateNode;break;case 30:a=l.stateNode;break;default:a=l.stateNode}typeof e=="function"?l.refCleanup=e(a):e.current=a}}catch(u){hl(l,t,u)}}function Ut(l,t){var e=l.ref,a=l.refCleanup;if(e!==null)if(typeof a=="function")try{a()}catch(u){hl(l,t,u)}finally{l.refCleanup=null,l=l.alternate,l!=null&&(l.refCleanup=null)}else if(typeof e=="function")try{e(null)}catch(u){hl(l,t,u)}else e.current=null}function w0(l){var t=l.type,e=l.memoizedProps,a=l.stateNode;try{l:switch(t){case"button":case"input":case"select":case"textarea":e.autoFocus&&a.focus();break l;case"img":e.src?a.src=e.src:e.srcSet&&(a.srcset=e.srcSet)}}catch(u){hl(l,l.return,u)}}function bi(l,t,e){try{var 
a=l.stateNode;Oy(a,l.type,e,t),a[Fl]=t}catch(u){hl(l,l.return,u)}}function W0(l){return l.tag===5||l.tag===3||l.tag===26||l.tag===27&&ge(l.type)||l.tag===4}function zi(l){l:for(;;){for(;l.sibling===null;){if(l.return===null||W0(l.return))return null;l=l.return}for(l.sibling.return=l.return,l=l.sibling;l.tag!==5&&l.tag!==6&&l.tag!==18;){if(l.tag===27&&ge(l.type)||l.flags&2||l.child===null||l.tag===4)continue l;l.child.return=l,l=l.child}if(!(l.flags&2))return l.stateNode}}function Ei(l,t,e){var a=l.tag;if(a===5||a===6)l=l.stateNode,t?(e.nodeType===9?e.body:e.nodeName==="HTML"?e.ownerDocument.body:e).insertBefore(l,t):(t=e.nodeType===9?e.body:e.nodeName==="HTML"?e.ownerDocument.body:e,t.appendChild(l),e=e._reactRootContainer,e!=null||t.onclick!==null||(t.onclick=Ht));else if(a!==4&&(a===27&&ge(l.type)&&(e=l.stateNode,t=null),l=l.child,l!==null))for(Ei(l,t,e),l=l.sibling;l!==null;)Ei(l,t,e),l=l.sibling}function yn(l,t,e){var a=l.tag;if(a===5||a===6)l=l.stateNode,t?e.insertBefore(l,t):e.appendChild(l);else if(a!==4&&(a===27&&ge(l.type)&&(e=l.stateNode),l=l.child,l!==null))for(yn(l,t,e),l=l.sibling;l!==null;)yn(l,t,e),l=l.sibling}function $0(l){var t=l.stateNode,e=l.memoizedProps;try{for(var a=l.type,u=t.attributes;u.length;)t.removeAttributeNode(u[0]);Ll(t,a,e),t[Ql]=l,t[Fl]=e}catch(n){hl(l,l.return,n)}}var Kt=!1,jl=!1,Ti=!1,k0=typeof WeakSet=="function"?WeakSet:Set,Yl=null;function ny(l,t){if(l=l.containerInfo,Vi=jn,l=fs(l),hc(l)){if("selectionStart"in l)var e={start:l.selectionStart,end:l.selectionEnd};else l:{e=(e=l.ownerDocument)&&e.defaultView||window;var a=e.getSelection&&e.getSelection();if(a&&a.rangeCount!==0){e=a.anchorNode;var u=a.anchorOffset,n=a.focusNode;a=a.focusOffset;try{e.nodeType,n.nodeType}catch{e=null;break l}var c=0,i=-1,f=-1,r=0,b=0,A=l,h=null;t:for(;;){for(var S;A!==e||u!==0&&A.nodeType!==3||(i=c+u),A!==n||a!==0&&A.nodeType!==3||(f=c+a),A.nodeType===3&&(c+=A.nodeValue.length),(S=A.firstChild)!==null;)h=A,A=S;for(;;){if(A===l)break 
t;if(h===e&&++r===u&&(i=c),h===n&&++b===a&&(f=c),(S=A.nextSibling)!==null)break;A=h,h=A.parentNode}A=S}e=i===-1||f===-1?null:{start:i,end:f}}else e=null}e=e||{start:0,end:0}}else e=null;for(Li={focusedElem:l,selectionRange:e},jn=!1,Yl=t;Yl!==null;)if(t=Yl,l=t.child,(t.subtreeFlags&1028)!==0&&l!==null)l.return=t,Yl=l;else for(;Yl!==null;){switch(t=Yl,n=t.alternate,l=t.flags,t.tag){case 0:if((l&4)!==0&&(l=t.updateQueue,l=l!==null?l.events:null,l!==null))for(e=0;e<l.length;e++)u=l[e],u.ref.impl=u.nextImpl;break;case 11:case 15:break;case 1:if((l&1024)!==0&&n!==null){l=void 0,e=t,u=n.memoizedProps,n=n.memoizedState,a=e.stateNode;try{var j=Ge(e.type,u);l=a.getSnapshotBeforeUpdate(j,n),a.__reactInternalSnapshotBeforeUpdate=l}catch(Z){hl(e,e.return,Z)}}break;case 3:if((l&1024)!==0){if(l=t.stateNode.containerInfo,e=l.nodeType,e===9)wi(l);else if(e===1)switch(l.nodeName){case"HEAD":case"HTML":case"BODY":wi(l);break;default:l.textContent=""}}break;case 5:case 26:case 27:case 6:case 4:case 17:break;default:if((l&1024)!==0)throw Error(o(163))}if(l=t.sibling,l!==null){l.return=t.return,Yl=l;break}Yl=t.return}}function F0(l,t,e){var a=e.flags;switch(e.tag){case 0:case 11:case 15:wt(l,e),a&4&&eu(5,e);break;case 1:if(wt(l,e),a&4)if(l=e.stateNode,t===null)try{l.componentDidMount()}catch(c){hl(e,e.return,c)}else{var u=Ge(e.type,t.memoizedProps);t=t.memoizedState;try{l.componentDidUpdate(u,t,l.__reactInternalSnapshotBeforeUpdate)}catch(c){hl(e,e.return,c)}}a&64&&K0(e),a&512&&au(e,e.return);break;case 3:if(wt(l,e),a&64&&(l=e.updateQueue,l!==null)){if(t=null,e.child!==null)switch(e.child.tag){case 27:case 5:t=e.child.stateNode;break;case 1:t=e.child.stateNode}try{Hs(l,t)}catch(c){hl(e,e.return,c)}}break;case 27:t===null&&a&4&&$0(e);case 26:case 5:wt(l,e),t===null&&a&4&&w0(e),a&512&&au(e,e.return);break;case 12:wt(l,e);break;case 31:wt(l,e),a&4&&ld(l,e);break;case 
13:wt(l,e),a&4&&td(l,e),a&64&&(l=e.memoizedState,l!==null&&(l=l.dehydrated,l!==null&&(e=ry.bind(null,e),jy(l,e))));break;case 22:if(a=e.memoizedState!==null||Kt,!a){t=t!==null&&t.memoizedState!==null||jl,u=Kt;var n=jl;Kt=a,(jl=t)&&!n?Wt(l,e,(e.subtreeFlags&8772)!==0):wt(l,e),Kt=u,jl=n}break;case 30:break;default:wt(l,e)}}function I0(l){var t=l.alternate;t!==null&&(l.alternate=null,I0(t)),l.child=null,l.deletions=null,l.sibling=null,l.tag===5&&(t=l.stateNode,t!==null&&In(t)),l.stateNode=null,l.return=null,l.dependencies=null,l.memoizedProps=null,l.memoizedState=null,l.pendingProps=null,l.stateNode=null,l.updateQueue=null}var _l=null,Pl=!1;function Jt(l,t,e){for(e=e.child;e!==null;)P0(l,t,e),e=e.sibling}function P0(l,t,e){if(ct&&typeof ct.onCommitFiberUnmount=="function")try{ct.onCommitFiberUnmount(Ma,e)}catch{}switch(e.tag){case 26:jl||Ut(e,t),Jt(l,t,e),e.memoizedState?e.memoizedState.count--:e.stateNode&&(e=e.stateNode,e.parentNode.removeChild(e));break;case 27:jl||Ut(e,t);var a=_l,u=Pl;ge(e.type)&&(_l=e.stateNode,Pl=!1),Jt(l,t,e),mu(e.stateNode),_l=a,Pl=u;break;case 5:jl||Ut(e,t);case 6:if(a=_l,u=Pl,_l=null,Jt(l,t,e),_l=a,Pl=u,_l!==null)if(Pl)try{(_l.nodeType===9?_l.body:_l.nodeName==="HTML"?_l.ownerDocument.body:_l).removeChild(e.stateNode)}catch(n){hl(e,t,n)}else try{_l.removeChild(e.stateNode)}catch(n){hl(e,t,n)}break;case 18:_l!==null&&(Pl?(l=_l,Kd(l.nodeType===9?l.body:l.nodeName==="HTML"?l.ownerDocument.body:l,e.stateNode),_a(l)):Kd(_l,e.stateNode));break;case 4:a=_l,u=Pl,_l=e.stateNode.containerInfo,Pl=!0,Jt(l,t,e),_l=a,Pl=u;break;case 0:case 11:case 14:case 15:de(2,e,t),jl||de(4,e,t),Jt(l,t,e);break;case 1:jl||(Ut(e,t),a=e.stateNode,typeof a.componentWillUnmount=="function"&&J0(e,t,a)),Jt(l,t,e);break;case 21:Jt(l,t,e);break;case 22:jl=(a=jl)||e.memoizedState!==null,Jt(l,t,e),jl=a;break;default:Jt(l,t,e)}}function 
ld(l,t){if(t.memoizedState===null&&(l=t.alternate,l!==null&&(l=l.memoizedState,l!==null))){l=l.dehydrated;try{_a(l)}catch(e){hl(t,t.return,e)}}}function td(l,t){if(t.memoizedState===null&&(l=t.alternate,l!==null&&(l=l.memoizedState,l!==null&&(l=l.dehydrated,l!==null))))try{_a(l)}catch(e){hl(t,t.return,e)}}function cy(l){switch(l.tag){case 31:case 13:case 19:var t=l.stateNode;return t===null&&(t=l.stateNode=new k0),t;case 22:return l=l.stateNode,t=l._retryCache,t===null&&(t=l._retryCache=new k0),t;default:throw Error(o(435,l.tag))}}function rn(l,t){var e=cy(l);t.forEach(function(a){if(!e.has(a)){e.add(a);var u=hy.bind(null,l,a);a.then(u,u)}})}function lt(l,t){var e=t.deletions;if(e!==null)for(var a=0;a<e.length;a++){var u=e[a],n=l,c=t,i=c;l:for(;i!==null;){switch(i.tag){case 27:if(ge(i.type)){_l=i.stateNode,Pl=!1;break l}break;case 5:_l=i.stateNode,Pl=!1;break l;case 3:case 4:_l=i.stateNode.containerInfo,Pl=!0;break l}i=i.return}if(_l===null)throw Error(o(160));P0(n,c,u),_l=null,Pl=!1,n=u.alternate,n!==null&&(n.return=null),u.return=null}if(t.subtreeFlags&13886)for(t=t.child;t!==null;)ed(t,l),t=t.sibling}var Mt=null;function ed(l,t){var e=l.alternate,a=l.flags;switch(l.tag){case 0:case 11:case 14:case 15:lt(t,l),tt(l),a&4&&(de(3,l,l.return),eu(3,l),de(5,l,l.return));break;case 1:lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),a&64&&Kt&&(l=l.updateQueue,l!==null&&(a=l.callbacks,a!==null&&(e=l.shared.hiddenCallbacks,l.shared.hiddenCallbacks=e===null?a:e.concat(a))));break;case 26:var u=Mt;if(lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),a&4){var n=e!==null?e.memoizedState:null;if(a=l.memoizedState,e===null)if(a===null)if(l.stateNode===null){l:{a=l.type,e=l.memoizedProps,u=u.ownerDocument||u;t:switch(a){case"title":n=u.getElementsByTagName("title")[0],(!n||n[Da]||n[Ql]||n.namespaceURI==="http://www.w3.org/2000/svg"||n.hasAttribute("itemprop"))&&(n=u.createElement(a),u.head.insertBefore(n,u.querySelector("head > title"))),Ll(n,a,e),n[Ql]=l,ql(n),a=n;break 
l;case"link":var c=eo("link","href",u).get(a+(e.href||""));if(c){for(var i=0;i<c.length;i++)if(n=c[i],n.getAttribute("href")===(e.href==null||e.href===""?null:e.href)&&n.getAttribute("rel")===(e.rel==null?null:e.rel)&&n.getAttribute("title")===(e.title==null?null:e.title)&&n.getAttribute("crossorigin")===(e.crossOrigin==null?null:e.crossOrigin)){c.splice(i,1);break t}}n=u.createElement(a),Ll(n,a,e),u.head.appendChild(n);break;case"meta":if(c=eo("meta","content",u).get(a+(e.content||""))){for(i=0;i<c.length;i++)if(n=c[i],n.getAttribute("content")===(e.content==null?null:""+e.content)&&n.getAttribute("name")===(e.name==null?null:e.name)&&n.getAttribute("property")===(e.property==null?null:e.property)&&n.getAttribute("http-equiv")===(e.httpEquiv==null?null:e.httpEquiv)&&n.getAttribute("charset")===(e.charSet==null?null:e.charSet)){c.splice(i,1);break t}}n=u.createElement(a),Ll(n,a,e),u.head.appendChild(n);break;default:throw Error(o(468,a))}n[Ql]=l,ql(n),a=n}l.stateNode=a}else ao(u,l.type,l.stateNode);else l.stateNode=to(u,a,l.memoizedProps);else n!==a?(n===null?e.stateNode!==null&&(e=e.stateNode,e.parentNode.removeChild(e)):n.count--,a===null?ao(u,l.type,l.stateNode):to(u,a,l.memoizedProps)):a===null&&l.stateNode!==null&&bi(l,l.memoizedProps,e.memoizedProps)}break;case 27:lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),e!==null&&a&4&&bi(l,l.memoizedProps,e.memoizedProps);break;case 5:if(lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),l.flags&32){u=l.stateNode;try{We(u,"")}catch(j){hl(l,l.return,j)}}a&4&&l.stateNode!=null&&(u=l.memoizedProps,bi(l,u,e!==null?e.memoizedProps:u)),a&1024&&(Ti=!0);break;case 6:if(lt(t,l),tt(l),a&4){if(l.stateNode===null)throw Error(o(162));a=l.memoizedProps,e=l.stateNode;try{e.nodeValue=a}catch(j){hl(l,l.return,j)}}break;case 3:if(Dn=null,u=Mt,Mt=xn(t.containerInfo),lt(t,l),Mt=u,tt(l),a&4&&e!==null&&e.memoizedState.isDehydrated)try{_a(t.containerInfo)}catch(j){hl(l,l.return,j)}Ti&&(Ti=!1,ad(l));break;case 
4:a=Mt,Mt=xn(l.stateNode.containerInfo),lt(t,l),tt(l),Mt=a;break;case 12:lt(t,l),tt(l);break;case 31:lt(t,l),tt(l),a&4&&(a=l.updateQueue,a!==null&&(l.updateQueue=null,rn(l,a)));break;case 13:lt(t,l),tt(l),l.child.flags&8192&&l.memoizedState!==null!=(e!==null&&e.memoizedState!==null)&&(vn=nt()),a&4&&(a=l.updateQueue,a!==null&&(l.updateQueue=null,rn(l,a)));break;case 22:u=l.memoizedState!==null;var f=e!==null&&e.memoizedState!==null,r=Kt,b=jl;if(Kt=r||u,jl=b||f,lt(t,l),jl=b,Kt=r,tt(l),a&8192)l:for(t=l.stateNode,t._visibility=u?t._visibility&-2:t._visibility|1,u&&(e===null||f||Kt||jl||Qe(l)),e=null,t=l;;){if(t.tag===5||t.tag===26){if(e===null){f=e=t;try{if(n=f.stateNode,u)c=n.style,typeof c.setProperty=="function"?c.setProperty("display","none","important"):c.display="none";else{i=f.stateNode;var A=f.memoizedProps.style,h=A!=null&&A.hasOwnProperty("display")?A.display:null;i.style.display=h==null||typeof h=="boolean"?"":(""+h).trim()}}catch(j){hl(f,f.return,j)}}}else if(t.tag===6){if(e===null){f=t;try{f.stateNode.nodeValue=u?"":f.memoizedProps}catch(j){hl(f,f.return,j)}}}else if(t.tag===18){if(e===null){f=t;try{var S=f.stateNode;u?Jd(S,!0):Jd(f.stateNode,!1)}catch(j){hl(f,f.return,j)}}}else if((t.tag!==22&&t.tag!==23||t.memoizedState===null||t===l)&&t.child!==null){t.child.return=t,t=t.child;continue}if(t===l)break l;for(;t.sibling===null;){if(t.return===null||t.return===l)break l;e===t&&(e=null),t=t.return}e===t&&(e=null),t.sibling.return=t.return,t=t.sibling}a&4&&(a=l.updateQueue,a!==null&&(e=a.retryQueue,e!==null&&(a.retryQueue=null,rn(l,e))));break;case 19:lt(t,l),tt(l),a&4&&(a=l.updateQueue,a!==null&&(l.updateQueue=null,rn(l,a)));break;case 30:break;case 21:break;default:lt(t,l),tt(l)}}function tt(l){var t=l.flags;if(t&2){try{for(var e,a=l.return;a!==null;){if(W0(a)){e=a;break}a=a.return}if(e==null)throw Error(o(160));switch(e.tag){case 27:var u=e.stateNode,n=zi(l);yn(l,n,u);break;case 5:var c=e.stateNode;e.flags&32&&(We(c,""),e.flags&=-33);var 
i=zi(l);yn(l,i,c);break;case 3:case 4:var f=e.stateNode.containerInfo,r=zi(l);Ei(l,r,f);break;default:throw Error(o(161))}}catch(b){hl(l,l.return,b)}l.flags&=-3}t&4096&&(l.flags&=-4097)}function ad(l){if(l.subtreeFlags&1024)for(l=l.child;l!==null;){var t=l;ad(t),t.tag===5&&t.flags&1024&&t.stateNode.reset(),l=l.sibling}}function wt(l,t){if(t.subtreeFlags&8772)for(t=t.child;t!==null;)F0(l,t.alternate,t),t=t.sibling}function Qe(l){for(l=l.child;l!==null;){var t=l;switch(t.tag){case 0:case 11:case 14:case 15:de(4,t,t.return),Qe(t);break;case 1:Ut(t,t.return);var e=t.stateNode;typeof e.componentWillUnmount=="function"&&J0(t,t.return,e),Qe(t);break;case 27:mu(t.stateNode);case 26:case 5:Ut(t,t.return),Qe(t);break;case 22:t.memoizedState===null&&Qe(t);break;case 30:Qe(t);break;default:Qe(t)}l=l.sibling}}function Wt(l,t,e){for(e=e&&(t.subtreeFlags&8772)!==0,t=t.child;t!==null;){var a=t.alternate,u=l,n=t,c=n.flags;switch(n.tag){case 0:case 11:case 15:Wt(u,n,e),eu(4,n);break;case 1:if(Wt(u,n,e),a=n,u=a.stateNode,typeof u.componentDidMount=="function")try{u.componentDidMount()}catch(r){hl(a,a.return,r)}if(a=n,u=a.updateQueue,u!==null){var i=a.stateNode;try{var f=u.shared.hiddenCallbacks;if(f!==null)for(u.shared.hiddenCallbacks=null,u=0;u<f.length;u++)js(f[u],i)}catch(r){hl(a,a.return,r)}}e&&c&64&&K0(n),au(n,n.return);break;case 27:$0(n);case 26:case 5:Wt(u,n,e),e&&a===null&&c&4&&w0(n),au(n,n.return);break;case 12:Wt(u,n,e);break;case 31:Wt(u,n,e),e&&c&4&&ld(u,n);break;case 13:Wt(u,n,e),e&&c&4&&td(u,n);break;case 22:n.memoizedState===null&&Wt(u,n,e),au(n,n.return);break;case 30:break;default:Wt(u,n,e)}t=t.sibling}}function Ai(l,t){var e=null;l!==null&&l.memoizedState!==null&&l.memoizedState.cachePool!==null&&(e=l.memoizedState.cachePool.pool),l=null,t.memoizedState!==null&&t.memoizedState.cachePool!==null&&(l=t.memoizedState.cachePool.pool),l!==e&&(l!=null&&l.refCount++,e!=null&&Va(e))}function 
pi(l,t){l=null,t.alternate!==null&&(l=t.alternate.memoizedState.cache),t=t.memoizedState.cache,t!==l&&(t.refCount++,l!=null&&Va(l))}function xt(l,t,e,a){if(t.subtreeFlags&10256)for(t=t.child;t!==null;)ud(l,t,e,a),t=t.sibling}function ud(l,t,e,a){var u=t.flags;switch(t.tag){case 0:case 11:case 15:xt(l,t,e,a),u&2048&&eu(9,t);break;case 1:xt(l,t,e,a);break;case 3:xt(l,t,e,a),u&2048&&(l=null,t.alternate!==null&&(l=t.alternate.memoizedState.cache),t=t.memoizedState.cache,t!==l&&(t.refCount++,l!=null&&Va(l)));break;case 12:if(u&2048){xt(l,t,e,a),l=t.stateNode;try{var n=t.memoizedProps,c=n.id,i=n.onPostCommit;typeof i=="function"&&i(c,t.alternate===null?"mount":"update",l.passiveEffectDuration,-0)}catch(f){hl(t,t.return,f)}}else xt(l,t,e,a);break;case 31:xt(l,t,e,a);break;case 13:xt(l,t,e,a);break;case 23:break;case 22:n=t.stateNode,c=t.alternate,t.memoizedState!==null?n._visibility&2?xt(l,t,e,a):uu(l,t):n._visibility&2?xt(l,t,e,a):(n._visibility|=2,ra(l,t,e,a,(t.subtreeFlags&10256)!==0||!1)),u&2048&&Ai(c,t);break;case 24:xt(l,t,e,a),u&2048&&pi(t.alternate,t);break;default:xt(l,t,e,a)}}function ra(l,t,e,a,u){for(u=u&&((t.subtreeFlags&10256)!==0||!1),t=t.child;t!==null;){var n=l,c=t,i=e,f=a,r=c.flags;switch(c.tag){case 0:case 11:case 15:ra(n,c,i,f,u),eu(8,c);break;case 23:break;case 22:var b=c.stateNode;c.memoizedState!==null?b._visibility&2?ra(n,c,i,f,u):uu(n,c):(b._visibility|=2,ra(n,c,i,f,u)),u&&r&2048&&Ai(c.alternate,c);break;case 24:ra(n,c,i,f,u),u&&r&2048&&pi(c.alternate,c);break;default:ra(n,c,i,f,u)}t=t.sibling}}function uu(l,t){if(t.subtreeFlags&10256)for(t=t.child;t!==null;){var e=l,a=t,u=a.flags;switch(a.tag){case 22:uu(e,a),u&2048&&Ai(a.alternate,a);break;case 24:uu(e,a),u&2048&&pi(a.alternate,a);break;default:uu(e,a)}t=t.sibling}}var nu=8192;function ha(l,t,e){if(l.subtreeFlags&nu)for(l=l.child;l!==null;)nd(l,t,e),l=l.sibling}function nd(l,t,e){switch(l.tag){case 
26:ha(l,t,e),l.flags&nu&&l.memoizedState!==null&&Jy(e,Mt,l.memoizedState,l.memoizedProps);break;case 5:ha(l,t,e);break;case 3:case 4:var a=Mt;Mt=xn(l.stateNode.containerInfo),ha(l,t,e),Mt=a;break;case 22:l.memoizedState===null&&(a=l.alternate,a!==null&&a.memoizedState!==null?(a=nu,nu=16777216,ha(l,t,e),nu=a):ha(l,t,e));break;default:ha(l,t,e)}}function cd(l){var t=l.alternate;if(t!==null&&(l=t.child,l!==null)){t.child=null;do t=l.sibling,l.sibling=null,l=t;while(l!==null)}}function cu(l){var t=l.deletions;if((l.flags&16)!==0){if(t!==null)for(var e=0;e<t.length;e++){var a=t[e];Yl=a,fd(a,l)}cd(l)}if(l.subtreeFlags&10256)for(l=l.child;l!==null;)id(l),l=l.sibling}function id(l){switch(l.tag){case 0:case 11:case 15:cu(l),l.flags&2048&&de(9,l,l.return);break;case 3:cu(l);break;case 12:cu(l);break;case 22:var t=l.stateNode;l.memoizedState!==null&&t._visibility&2&&(l.return===null||l.return.tag!==13)?(t._visibility&=-3,hn(l)):cu(l);break;default:cu(l)}}function hn(l){var t=l.deletions;if((l.flags&16)!==0){if(t!==null)for(var e=0;e<t.length;e++){var a=t[e];Yl=a,fd(a,l)}cd(l)}for(l=l.child;l!==null;){switch(t=l,t.tag){case 0:case 11:case 15:de(8,t,t.return),hn(t);break;case 22:e=t.stateNode,e._visibility&2&&(e._visibility&=-3,hn(t));break;default:hn(t)}l=l.sibling}}function fd(l,t){for(;Yl!==null;){var e=Yl;switch(e.tag){case 0:case 11:case 15:de(8,e,t);break;case 23:case 22:if(e.memoizedState!==null&&e.memoizedState.cachePool!==null){var a=e.memoizedState.cachePool.pool;a!=null&&a.refCount++}break;case 24:Va(e.memoizedState.cache)}if(a=e.child,a!==null)a.return=e,Yl=a;else l:for(e=l;Yl!==null;){a=Yl;var u=a.sibling,n=a.return;if(I0(a),a===e){Yl=null;break l}if(u!==null){u.return=n,Yl=u;break l}Yl=n}}}var iy={getCacheForType:function(l){var t=Zl(Ul),e=t.data.get(l);return e===void 0&&(e=l(),t.data.set(l,e)),e},cacheSignal:function(){return Zl(Ul).controller.signal}},fy=typeof 
WeakMap=="function"?WeakMap:Map,sl=0,zl=null,k=null,P=0,rl=0,mt=null,oe=!1,va=!1,_i=!1,$t=0,Ml=0,me=0,Xe=0,Oi=0,yt=0,ga=0,iu=null,et=null,Mi=!1,vn=0,sd=0,gn=1/0,Sn=null,ye=null,Bl=0,re=null,Sa=null,kt=0,xi=0,Ni=null,dd=null,fu=0,Di=null;function rt(){return(sl&2)!==0&&P!==0?P&-P:z.T!==null?Bi():Of()}function od(){if(yt===0)if((P&536870912)===0||el){var l=_u;_u<<=1,(_u&3932160)===0&&(_u=262144),yt=l}else yt=536870912;return l=dt.current,l!==null&&(l.flags|=32),yt}function at(l,t,e){(l===zl&&(rl===2||rl===9)||l.cancelPendingCommit!==null)&&(ba(l,0),he(l,P,yt,!1)),Na(l,e),((sl&2)===0||l!==zl)&&(l===zl&&((sl&2)===0&&(Xe|=e),Ml===4&&he(l,P,yt,!1)),Ct(l))}function md(l,t,e){if((sl&6)!==0)throw Error(o(327));var a=!e&&(t&127)===0&&(t&l.expiredLanes)===0||xa(l,t),u=a?oy(l,t):Ci(l,t,!0),n=a;do{if(u===0){va&&!a&&he(l,t,0,!1);break}else{if(e=l.current.alternate,n&&!sy(e)){u=Ci(l,t,!1),n=!1;continue}if(u===2){if(n=t,l.errorRecoveryDisabledLanes&n)var c=0;else c=l.pendingLanes&-536870913,c=c!==0?c:c&536870912?536870912:0;if(c!==0){t=c;l:{var i=l;u=iu;var f=i.current.memoizedState.isDehydrated;if(f&&(ba(i,c).flags|=256),c=Ci(i,c,!1),c!==2){if(_i&&!f){i.errorRecoveryDisabledLanes|=n,Xe|=n,u=4;break l}n=et,et=u,n!==null&&(et===null?et=n:et.push.apply(et,n))}u=c}if(n=!1,u!==2)continue}}if(u===1){ba(l,0),he(l,t,0,!0);break}l:{switch(a=l,n=u,n){case 0:case 1:throw Error(o(345));case 4:if((t&4194048)!==t)break;case 6:he(a,t,yt,!oe);break l;case 2:et=null;break;case 3:case 5:break;default:throw Error(o(329))}if((t&62914560)===t&&(u=vn+300-nt(),10<u)){if(he(a,t,yt,!oe),Mu(a,0,!0)!==0)break l;kt=t,a.timeoutHandle=Vd(yd.bind(null,a,e,et,Sn,Mi,t,yt,Xe,ga,oe,n,"Throttled",-0,0),u);break l}yd(a,e,et,Sn,Mi,t,yt,Xe,ga,oe,n,null,-0,0)}}break}while(!0);Ct(l)}function 
yd(l,t,e,a,u,n,c,i,f,r,b,A,h,S){if(l.timeoutHandle=-1,A=t.subtreeFlags,A&8192||(A&16785408)===16785408){A={stylesheets:null,count:0,imgCount:0,imgBytes:0,suspenseyImages:[],waitingForImages:!0,waitingForViewTransition:!1,unsuspend:Ht},nd(t,n,A);var j=(n&62914560)===n?vn-nt():(n&4194048)===n?sd-nt():0;if(j=wy(A,j),j!==null){kt=n,l.cancelPendingCommit=j(Ed.bind(null,l,t,n,e,a,u,c,i,f,b,A,null,h,S)),he(l,n,c,!r);return}}Ed(l,t,n,e,a,u,c,i,f)}function sy(l){for(var t=l;;){var e=t.tag;if((e===0||e===11||e===15)&&t.flags&16384&&(e=t.updateQueue,e!==null&&(e=e.stores,e!==null)))for(var a=0;a<e.length;a++){var u=e[a],n=u.getSnapshot;u=u.value;try{if(!ft(n(),u))return!1}catch{return!1}}if(e=t.child,t.subtreeFlags&16384&&e!==null)e.return=t,t=e;else{if(t===l)break;for(;t.sibling===null;){if(t.return===null||t.return===l)return!0;t=t.return}t.sibling.return=t.return,t=t.sibling}}return!0}function he(l,t,e,a){t&=~Oi,t&=~Xe,l.suspendedLanes|=t,l.pingedLanes&=~t,a&&(l.warmLanes|=t),a=l.expirationTimes;for(var u=t;0<u;){var n=31-it(u),c=1<<n;a[n]=-1,u&=~c}e!==0&&Af(l,e,t)}function bn(){return(sl&6)===0?(su(0),!1):!0}function Ui(){if(k!==null){if(rl===0)var l=k.return;else l=k,Gt=Ce=null,wc(l),sa=null,Ka=0,l=k;for(;l!==null;)L0(l.alternate,l),l=l.return;k=null}}function ba(l,t){var e=l.timeoutHandle;e!==-1&&(l.timeoutHandle=-1,Ny(e)),e=l.cancelPendingCommit,e!==null&&(l.cancelPendingCommit=null,e()),kt=0,Ui(),zl=l,k=e=qt(l.current,null),P=t,rl=0,mt=null,oe=!1,va=xa(l,t),_i=!1,ga=yt=Oi=Xe=me=Ml=0,et=iu=null,Mi=!1,(t&8)!==0&&(t|=t&32);var a=l.entangledLanes;if(a!==0)for(l=l.entanglements,a&=t;0<a;){var u=31-it(a),n=1<<u;t|=l[u],a&=~n}return $t=t,Gu(),e}function rd(l,t){J=null,z.H=Pa,t===fa||t===wu?(t=Ds(),rl=3):t===Hc?(t=Ds(),rl=4):rl=t===si?8:t!==null&&typeof t=="object"&&typeof t.then=="function"?6:1,mt=t,k===null&&(Ml=1,fn(l,St(t,l.current)))}function hd(){var l=dt.current;return l===null?!0:(P&4194048)===P?Tt===null:(P&62914560)===P||(P&536870912)!==0?l===Tt:!1}function vd(){var 
l=z.H;return z.H=Pa,l===null?Pa:l}function gd(){var l=z.A;return z.A=iy,l}function zn(){Ml=4,oe||(P&4194048)!==P&&dt.current!==null||(va=!0),(me&134217727)===0&&(Xe&134217727)===0||zl===null||he(zl,P,yt,!1)}function Ci(l,t,e){var a=sl;sl|=2;var u=vd(),n=gd();(zl!==l||P!==t)&&(Sn=null,ba(l,t)),t=!1;var c=Ml;l:do try{if(rl!==0&&k!==null){var i=k,f=mt;switch(rl){case 8:Ui(),c=6;break l;case 3:case 2:case 9:case 6:dt.current===null&&(t=!0);var r=rl;if(rl=0,mt=null,za(l,i,f,r),e&&va){c=0;break l}break;default:r=rl,rl=0,mt=null,za(l,i,f,r)}}dy(),c=Ml;break}catch(b){rd(l,b)}while(!0);return t&&l.shellSuspendCounter++,Gt=Ce=null,sl=a,z.H=u,z.A=n,k===null&&(zl=null,P=0,Gu()),c}function dy(){for(;k!==null;)Sd(k)}function oy(l,t){var e=sl;sl|=2;var a=vd(),u=gd();zl!==l||P!==t?(Sn=null,gn=nt()+500,ba(l,t)):va=xa(l,t);l:do try{if(rl!==0&&k!==null){t=k;var n=mt;t:switch(rl){case 1:rl=0,mt=null,za(l,t,n,1);break;case 2:case 9:if(xs(n)){rl=0,mt=null,bd(t);break}t=function(){rl!==2&&rl!==9||zl!==l||(rl=7),Ct(l)},n.then(t,t);break l;case 3:rl=7;break l;case 4:rl=5;break l;case 7:xs(n)?(rl=0,mt=null,bd(t)):(rl=0,mt=null,za(l,t,n,7));break;case 5:var c=null;switch(k.tag){case 26:c=k.memoizedState;case 5:case 27:var i=k;if(c?uo(c):i.stateNode.complete){rl=0,mt=null;var f=i.sibling;if(f!==null)k=f;else{var r=i.return;r!==null?(k=r,En(r)):k=null}break t}}rl=0,mt=null,za(l,t,n,5);break;case 6:rl=0,mt=null,za(l,t,n,6);break;case 8:Ui(),Ml=6;break l;default:throw Error(o(462))}}my();break}catch(b){rd(l,b)}while(!0);return Gt=Ce=null,z.H=a,z.A=u,sl=e,k!==null?0:(zl=null,P=0,Gu(),Ml)}function my(){for(;k!==null&&!Ho();)Sd(k)}function Sd(l){var t=Z0(l.alternate,l,$t);l.memoizedProps=l.pendingProps,t===null?En(l):k=t}function bd(l){var t=l,e=t.alternate;switch(t.tag){case 15:case 0:t=B0(e,t,t.pendingProps,t.type,void 0,P);break;case 11:t=B0(e,t,t.pendingProps,t.type.render,t.ref,P);break;case 
5:wc(t);default:L0(e,t),t=k=gs(t,$t),t=Z0(e,t,$t)}l.memoizedProps=l.pendingProps,t===null?En(l):k=t}function za(l,t,e,a){Gt=Ce=null,wc(t),sa=null,Ka=0;var u=t.return;try{if(ly(l,u,t,e,P)){Ml=1,fn(l,St(e,l.current)),k=null;return}}catch(n){if(u!==null)throw k=u,n;Ml=1,fn(l,St(e,l.current)),k=null;return}t.flags&32768?(el||a===1?l=!0:va||(P&536870912)!==0?l=!1:(oe=l=!0,(a===2||a===9||a===3||a===6)&&(a=dt.current,a!==null&&a.tag===13&&(a.flags|=16384))),zd(t,l)):En(t)}function En(l){var t=l;do{if((t.flags&32768)!==0){zd(t,oe);return}l=t.return;var e=ay(t.alternate,t,$t);if(e!==null){k=e;return}if(t=t.sibling,t!==null){k=t;return}k=t=l}while(t!==null);Ml===0&&(Ml=5)}function zd(l,t){do{var e=uy(l.alternate,l);if(e!==null){e.flags&=32767,k=e;return}if(e=l.return,e!==null&&(e.flags|=32768,e.subtreeFlags=0,e.deletions=null),!t&&(l=l.sibling,l!==null)){k=l;return}k=l=e}while(l!==null);Ml=6,k=null}function Ed(l,t,e,a,u,n,c,i,f){l.cancelPendingCommit=null;do Tn();while(Bl!==0);if((sl&6)!==0)throw Error(o(327));if(t!==null){if(t===l.current)throw Error(o(177));if(n=t.lanes|t.childLanes,n|=zc,Ko(l,e,n,c,i,f),l===zl&&(k=zl=null,P=0),Sa=t,re=l,kt=e,xi=n,Ni=u,dd=a,(t.subtreeFlags&10256)!==0||(t.flags&10256)!==0?(l.callbackNode=null,l.callbackPriority=0,vy(Au,function(){return Od(),null})):(l.callbackNode=null,l.callbackPriority=0),a=(t.flags&13878)!==0,(t.subtreeFlags&13878)!==0||a){a=z.T,z.T=null,u=N.p,N.p=2,c=sl,sl|=4;try{ny(l,t,e)}finally{sl=c,N.p=u,z.T=a}}Bl=1,Td(),Ad(),pd()}}function Td(){if(Bl===1){Bl=0;var l=re,t=Sa,e=(t.flags&13878)!==0;if((t.subtreeFlags&13878)!==0||e){e=z.T,z.T=null;var a=N.p;N.p=2;var u=sl;sl|=4;try{ed(t,l);var n=Li,c=fs(l.containerInfo),i=n.focusedElem,f=n.selectionRange;if(c!==i&&i&&i.ownerDocument&&is(i.ownerDocument.documentElement,i)){if(f!==null&&hc(i)){var r=f.start,b=f.end;if(b===void 0&&(b=r),"selectionStart"in i)i.selectionStart=r,i.selectionEnd=Math.min(b,i.value.length);else{var 
A=i.ownerDocument||document,h=A&&A.defaultView||window;if(h.getSelection){var S=h.getSelection(),j=i.textContent.length,Z=Math.min(f.start,j),bl=f.end===void 0?Z:Math.min(f.end,j);!S.extend&&Z>bl&&(c=bl,bl=Z,Z=c);var m=cs(i,Z),s=cs(i,bl);if(m&&s&&(S.rangeCount!==1||S.anchorNode!==m.node||S.anchorOffset!==m.offset||S.focusNode!==s.node||S.focusOffset!==s.offset)){var y=A.createRange();y.setStart(m.node,m.offset),S.removeAllRanges(),Z>bl?(S.addRange(y),S.extend(s.node,s.offset)):(y.setEnd(s.node,s.offset),S.addRange(y))}}}}for(A=[],S=i;S=S.parentNode;)S.nodeType===1&&A.push({element:S,left:S.scrollLeft,top:S.scrollTop});for(typeof i.focus=="function"&&i.focus(),i=0;i<A.length;i++){var T=A[i];T.element.scrollLeft=T.left,T.element.scrollTop=T.top}}jn=!!Vi,Li=Vi=null}finally{sl=u,N.p=a,z.T=e}}l.current=t,Bl=2}}function Ad(){if(Bl===2){Bl=0;var l=re,t=Sa,e=(t.flags&8772)!==0;if((t.subtreeFlags&8772)!==0||e){e=z.T,z.T=null;var a=N.p;N.p=2;var u=sl;sl|=4;try{F0(l,t.alternate,t)}finally{sl=u,N.p=a,z.T=e}}Bl=3}}function pd(){if(Bl===4||Bl===3){Bl=0,Bo();var l=re,t=Sa,e=kt,a=dd;(t.subtreeFlags&10256)!==0||(t.flags&10256)!==0?Bl=5:(Bl=0,Sa=re=null,_d(l,l.pendingLanes));var u=l.pendingLanes;if(u===0&&(ye=null),kn(e),t=t.stateNode,ct&&typeof ct.onCommitFiberRoot=="function")try{ct.onCommitFiberRoot(Ma,t,void 0,(t.current.flags&128)===128)}catch{}if(a!==null){t=z.T,u=N.p,N.p=2,z.T=null;try{for(var n=l.onRecoverableError,c=0;c<a.length;c++){var i=a[c];n(i.value,{componentStack:i.stack})}}finally{z.T=t,N.p=u}}(kt&3)!==0&&Tn(),Ct(l),u=l.pendingLanes,(e&261930)!==0&&(u&42)!==0?l===Di?fu++:(fu=0,Di=l):fu=0,su(0)}}function _d(l,t){(l.pooledCacheLanes&=t)===0&&(t=l.pooledCache,t!=null&&(l.pooledCache=null,Va(t)))}function Tn(){return Td(),Ad(),pd(),Od()}function Od(){if(Bl!==5)return!1;var l=re,t=xi;xi=0;var e=kn(kt),a=z.T,u=N.p;try{N.p=32>e?32:e,z.T=null,e=Ni,Ni=null;var n=re,c=kt;if(Bl=0,Sa=re=null,kt=0,(sl&6)!==0)throw Error(o(331));var 
i=sl;if(sl|=4,id(n.current),ud(n,n.current,c,e),sl=i,su(0,!1),ct&&typeof ct.onPostCommitFiberRoot=="function")try{ct.onPostCommitFiberRoot(Ma,n)}catch{}return!0}finally{N.p=u,z.T=a,_d(l,t)}}function Md(l,t,e){t=St(e,t),t=fi(l.stateNode,t,2),l=ie(l,t,2),l!==null&&(Na(l,2),Ct(l))}function hl(l,t,e){if(l.tag===3)Md(l,l,e);else for(;t!==null;){if(t.tag===3){Md(t,l,e);break}else if(t.tag===1){var a=t.stateNode;if(typeof t.type.getDerivedStateFromError=="function"||typeof a.componentDidCatch=="function"&&(ye===null||!ye.has(a))){l=St(e,l),e=x0(2),a=ie(t,e,2),a!==null&&(N0(e,a,t,l),Na(a,2),Ct(a));break}}t=t.return}}function Ri(l,t,e){var a=l.pingCache;if(a===null){a=l.pingCache=new fy;var u=new Set;a.set(t,u)}else u=a.get(t),u===void 0&&(u=new Set,a.set(t,u));u.has(e)||(_i=!0,u.add(e),l=yy.bind(null,l,t,e),t.then(l,l))}function yy(l,t,e){var a=l.pingCache;a!==null&&a.delete(t),l.pingedLanes|=l.suspendedLanes&e,l.warmLanes&=~e,zl===l&&(P&e)===e&&(Ml===4||Ml===3&&(P&62914560)===P&&300>nt()-vn?(sl&2)===0&&ba(l,0):Oi|=e,ga===P&&(ga=0)),Ct(l)}function xd(l,t){t===0&&(t=Tf()),l=Ne(l,t),l!==null&&(Na(l,t),Ct(l))}function ry(l){var t=l.memoizedState,e=0;t!==null&&(e=t.retryLane),xd(l,e)}function hy(l,t){var e=0;switch(l.tag){case 31:case 13:var a=l.stateNode,u=l.memoizedState;u!==null&&(e=u.retryLane);break;case 19:a=l.stateNode;break;case 22:a=l.stateNode._retryCache;break;default:throw Error(o(314))}a!==null&&a.delete(t),xd(l,e)}function vy(l,t){return Jn(l,t)}var An=null,Ea=null,ji=!1,pn=!1,Hi=!1,ve=0;function Ct(l){l!==Ea&&l.next===null&&(Ea===null?An=Ea=l:Ea=Ea.next=l),pn=!0,ji||(ji=!0,Sy())}function su(l,t){if(!Hi&&pn){Hi=!0;do for(var e=!1,a=An;a!==null;){if(l!==0){var u=a.pendingLanes;if(u===0)var n=0;else{var c=a.suspendedLanes,i=a.pingedLanes;n=(1<<31-it(42|l)+1)-1,n&=u&~(c&~i),n=n&201326741?n&201326741|1:n?n|2:0}n!==0&&(e=!0,Cd(a,n))}else 
n=P,n=Mu(a,a===zl?n:0,a.cancelPendingCommit!==null||a.timeoutHandle!==-1),(n&3)===0||xa(a,n)||(e=!0,Cd(a,n));a=a.next}while(e);Hi=!1}}function gy(){Nd()}function Nd(){pn=ji=!1;var l=0;ve!==0&&xy()&&(l=ve);for(var t=nt(),e=null,a=An;a!==null;){var u=a.next,n=Dd(a,t);n===0?(a.next=null,e===null?An=u:e.next=u,u===null&&(Ea=e)):(e=a,(l!==0||(n&3)!==0)&&(pn=!0)),a=u}Bl!==0&&Bl!==5||su(l),ve!==0&&(ve=0)}function Dd(l,t){for(var e=l.suspendedLanes,a=l.pingedLanes,u=l.expirationTimes,n=l.pendingLanes&-62914561;0<n;){var c=31-it(n),i=1<<c,f=u[c];f===-1?((i&e)===0||(i&a)!==0)&&(u[c]=Lo(i,t)):f<=t&&(l.expiredLanes|=i),n&=~i}if(t=zl,e=P,e=Mu(l,l===t?e:0,l.cancelPendingCommit!==null||l.timeoutHandle!==-1),a=l.callbackNode,e===0||l===t&&(rl===2||rl===9)||l.cancelPendingCommit!==null)return a!==null&&a!==null&&wn(a),l.callbackNode=null,l.callbackPriority=0;if((e&3)===0||xa(l,e)){if(t=e&-e,t===l.callbackPriority)return t;switch(a!==null&&wn(a),kn(e)){case 2:case 8:e=zf;break;case 32:e=Au;break;case 268435456:e=Ef;break;default:e=Au}return a=Ud.bind(null,l),e=Jn(e,a),l.callbackPriority=t,l.callbackNode=e,t}return a!==null&&a!==null&&wn(a),l.callbackPriority=2,l.callbackNode=null,2}function Ud(l,t){if(Bl!==0&&Bl!==5)return l.callbackNode=null,l.callbackPriority=0,null;var e=l.callbackNode;if(Tn()&&l.callbackNode!==e)return null;var a=P;return a=Mu(l,l===zl?a:0,l.cancelPendingCommit!==null||l.timeoutHandle!==-1),a===0?null:(md(l,a,t),Dd(l,nt()),l.callbackNode!=null&&l.callbackNode===e?Ud.bind(null,l):null)}function Cd(l,t){if(Tn())return null;md(l,t,!0)}function Sy(){Dy(function(){(sl&6)!==0?Jn(bf,gy):Nd()})}function Bi(){if(ve===0){var l=ca;l===0&&(l=pu,pu<<=1,(pu&261888)===0&&(pu=256)),ve=l}return ve}function Rd(l){return l==null||typeof l=="symbol"||typeof l=="boolean"?null:typeof l=="function"?l:Uu(""+l)}function jd(l,t){var e=t.ownerDocument.createElement("input");return e.name=t.name,e.value=t.value,l.id&&e.setAttribute("form",l.id),t.parentNode.insertBefore(e,t),l=new 
FormData(l),e.parentNode.removeChild(e),l}function by(l,t,e,a,u){if(t==="submit"&&e&&e.stateNode===u){var n=Rd((u[Fl]||null).action),c=a.submitter;c&&(t=(t=c[Fl]||null)?Rd(t.formAction):c.getAttribute("formAction"),t!==null&&(n=t,c=null));var i=new Hu("action","action",null,a,u);l.push({event:i,listeners:[{instance:null,listener:function(){if(a.defaultPrevented){if(ve!==0){var f=c?jd(u,c):new FormData(u);ei(e,{pending:!0,data:f,method:u.method,action:n},null,f)}}else typeof n=="function"&&(i.preventDefault(),f=c?jd(u,c):new FormData(u),ei(e,{pending:!0,data:f,method:u.method,action:n},n,f))},currentTarget:u}]})}}for(var qi=0;qi<bc.length;qi++){var Yi=bc[qi],zy=Yi.toLowerCase(),Ey=Yi[0].toUpperCase()+Yi.slice(1);Ot(zy,"on"+Ey)}Ot(os,"onAnimationEnd"),Ot(ms,"onAnimationIteration"),Ot(ys,"onAnimationStart"),Ot("dblclick","onDoubleClick"),Ot("focusin","onFocus"),Ot("focusout","onBlur"),Ot(qm,"onTransitionRun"),Ot(Ym,"onTransitionStart"),Ot(Gm,"onTransitionCancel"),Ot(rs,"onTransitionEnd"),Je("onMouseEnter",["mouseout","mouseover"]),Je("onMouseLeave",["mouseout","mouseover"]),Je("onPointerEnter",["pointerout","pointerover"]),Je("onPointerLeave",["pointerout","pointerover"]),_e("onChange","change click focusin focusout input keydown keyup selectionchange".split(" ")),_e("onSelect","focusout contextmenu dragend focusin keydown keyup mousedown mouseup selectionchange".split(" ")),_e("onBeforeInput",["compositionend","keypress","textInput","paste"]),_e("onCompositionEnd","compositionend focusout keydown keypress keyup mousedown".split(" ")),_e("onCompositionStart","compositionstart focusout keydown keypress keyup mousedown".split(" ")),_e("onCompositionUpdate","compositionupdate focusout keydown keypress keyup mousedown".split(" "));var du="abort canplay canplaythrough durationchange emptied encrypted ended error loadeddata loadedmetadata loadstart pause play playing progress ratechange resize seeked seeking stalled suspend timeupdate volumechange waiting".split(" "),Ty=new 
Set("beforetoggle cancel close invalid load scroll scrollend toggle".split(" ").concat(du));function Hd(l,t){t=(t&4)!==0;for(var e=0;e<l.length;e++){var a=l[e],u=a.event;a=a.listeners;l:{var n=void 0;if(t)for(var c=a.length-1;0<=c;c--){var i=a[c],f=i.instance,r=i.currentTarget;if(i=i.listener,f!==n&&u.isPropagationStopped())break l;n=i,u.currentTarget=r;try{n(u)}catch(b){Yu(b)}u.currentTarget=null,n=f}else for(c=0;c<a.length;c++){if(i=a[c],f=i.instance,r=i.currentTarget,i=i.listener,f!==n&&u.isPropagationStopped())break l;n=i,u.currentTarget=r;try{n(u)}catch(b){Yu(b)}u.currentTarget=null,n=f}}}}function F(l,t){var e=t[Fn];e===void 0&&(e=t[Fn]=new Set);var a=l+"__bubble";e.has(a)||(Bd(t,l,2,!1),e.add(a))}function Gi(l,t,e){var a=0;t&&(a|=4),Bd(e,l,a,t)}var _n="_reactListening"+Math.random().toString(36).slice(2);function Qi(l){if(!l[_n]){l[_n]=!0,Nf.forEach(function(e){e!=="selectionchange"&&(Ty.has(e)||Gi(e,!1,l),Gi(e,!0,l))});var t=l.nodeType===9?l:l.ownerDocument;t===null||t[_n]||(t[_n]=!0,Gi("selectionchange",!1,t))}}function Bd(l,t,e,a){switch(mo(t)){case 2:var u=ky;break;case 8:u=Fy;break;default:u=tf}e=u.bind(null,t,e,l),u=void 0,!cc||t!=="touchstart"&&t!=="touchmove"&&t!=="wheel"||(u=!0),a?u!==void 0?l.addEventListener(t,e,{capture:!0,passive:u}):l.addEventListener(t,e,!0):u!==void 0?l.addEventListener(t,e,{passive:u}):l.addEventListener(t,e,!1)}function Xi(l,t,e,a,u){var n=a;if((t&1)===0&&(t&2)===0&&a!==null)l:for(;;){if(a===null)return;var c=a.tag;if(c===3||c===4){var i=a.stateNode.containerInfo;if(i===u)break;if(c===4)for(c=a.return;c!==null;){var f=c.tag;if((f===3||f===4)&&c.stateNode.containerInfo===u)return;c=c.return}for(;i!==null;){if(c=Ve(i),c===null)return;if(f=c.tag,f===5||f===6||f===26||f===27){a=n=c;continue l}i=i.parentNode}}a=a.return}Xf(function(){var r=n,b=uc(e),A=[];l:{var h=hs.get(l);if(h!==void 0){var S=Hu,j=l;switch(l){case"keypress":if(Ru(e)===0)break 
l;case"keydown":case"keyup":S=hm;break;case"focusin":j="focus",S=dc;break;case"focusout":j="blur",S=dc;break;case"beforeblur":case"afterblur":S=dc;break;case"click":if(e.button===2)break l;case"auxclick":case"dblclick":case"mousedown":case"mousemove":case"mouseup":case"mouseout":case"mouseover":case"contextmenu":S=Lf;break;case"drag":case"dragend":case"dragenter":case"dragexit":case"dragleave":case"dragover":case"dragstart":case"drop":S=am;break;case"touchcancel":case"touchend":case"touchmove":case"touchstart":S=Sm;break;case os:case ms:case ys:S=cm;break;case rs:S=zm;break;case"scroll":case"scrollend":S=tm;break;case"wheel":S=Tm;break;case"copy":case"cut":case"paste":S=fm;break;case"gotpointercapture":case"lostpointercapture":case"pointercancel":case"pointerdown":case"pointermove":case"pointerout":case"pointerover":case"pointerup":S=Jf;break;case"toggle":case"beforetoggle":S=pm}var Z=(t&4)!==0,bl=!Z&&(l==="scroll"||l==="scrollend"),m=Z?h!==null?h+"Capture":null:h;Z=[];for(var s=r,y;s!==null;){var T=s;if(y=T.stateNode,T=T.tag,T!==5&&T!==26&&T!==27||y===null||m===null||(T=Ca(s,m),T!=null&&Z.push(ou(s,T,y))),bl)break;s=s.return}0<Z.length&&(h=new S(h,j,null,e,b),A.push({event:h,listeners:Z}))}}if((t&7)===0){l:{if(h=l==="mouseover"||l==="pointerover",S=l==="mouseout"||l==="pointerout",h&&e!==ac&&(j=e.relatedTarget||e.fromElement)&&(Ve(j)||j[Ze]))break l;if((S||h)&&(h=b.window===b?b:(h=b.ownerDocument)?h.defaultView||h.parentWindow:window,S?(j=e.relatedTarget||e.toElement,S=r,j=j?Ve(j):null,j!==null&&(bl=Y(j),Z=j.tag,j!==bl||Z!==5&&Z!==27&&Z!==6)&&(j=null)):(S=null,j=r),S!==j)){if(Z=Lf,T="onMouseLeave",m="onMouseEnter",s="mouse",(l==="pointerout"||l==="pointerover")&&(Z=Jf,T="onPointerLeave",m="onPointerEnter",s="pointer"),bl=S==null?h:Ua(S),y=j==null?h:Ua(j),h=new Z(T,s+"leave",S,e,b),h.target=bl,h.relatedTarget=y,T=null,Ve(b)===r&&(Z=new Z(m,s+"enter",j,e,b),Z.target=y,Z.relatedTarget=bl,T=Z),bl=T,S&&j)t:{for(Z=Ay,m=S,s=j,y=0,T=m;T;T=Z(T))y++;T=0;for(var 
Q=s;Q;Q=Z(Q))T++;for(;0<y-T;)m=Z(m),y--;for(;0<T-y;)s=Z(s),T--;for(;y--;){if(m===s||s!==null&&m===s.alternate){Z=m;break t}m=Z(m),s=Z(s)}Z=null}else Z=null;S!==null&&qd(A,h,S,Z,!1),j!==null&&bl!==null&&qd(A,bl,j,Z,!0)}}l:{if(h=r?Ua(r):window,S=h.nodeName&&h.nodeName.toLowerCase(),S==="select"||S==="input"&&h.type==="file")var cl=ls;else if(If(h))if(ts)cl=jm;else{cl=Cm;var B=Um}else S=h.nodeName,!S||S.toLowerCase()!=="input"||h.type!=="checkbox"&&h.type!=="radio"?r&&ec(r.elementType)&&(cl=ls):cl=Rm;if(cl&&(cl=cl(l,r))){Pf(A,cl,e,b);break l}B&&B(l,h,r),l==="focusout"&&r&&h.type==="number"&&r.memoizedProps.value!=null&&tc(h,"number",h.value)}switch(B=r?Ua(r):window,l){case"focusin":(If(B)||B.contentEditable==="true")&&(Ie=B,vc=r,Qa=null);break;case"focusout":Qa=vc=Ie=null;break;case"mousedown":gc=!0;break;case"contextmenu":case"mouseup":case"dragend":gc=!1,ss(A,e,b);break;case"selectionchange":if(Bm)break;case"keydown":case"keyup":ss(A,e,b)}var w;if(mc)l:{switch(l){case"compositionstart":var ll="onCompositionStart";break l;case"compositionend":ll="onCompositionEnd";break l;case"compositionupdate":ll="onCompositionUpdate";break l}ll=void 0}else Fe?kf(l,e)&&(ll="onCompositionEnd"):l==="keydown"&&e.keyCode===229&&(ll="onCompositionStart");ll&&(wf&&e.locale!=="ko"&&(Fe||ll!=="onCompositionStart"?ll==="onCompositionEnd"&&Fe&&(w=Zf()):(le=b,ic="value"in le?le.value:le.textContent,Fe=!0)),B=On(r,ll),0<B.length&&(ll=new Kf(ll,l,null,e,b),A.push({event:ll,listeners:B}),w?ll.data=w:(w=Ff(e),w!==null&&(ll.data=w)))),(w=Om?Mm(l,e):xm(l,e))&&(ll=On(r,"onBeforeInput"),0<ll.length&&(B=new Kf("onBeforeInput","beforeinput",null,e,b),A.push({event:B,listeners:ll}),B.data=w)),by(A,l,r,e,b)}Hd(A,t)})}function ou(l,t,e){return{instance:l,listener:t,currentTarget:e}}function On(l,t){for(var e=t+"Capture",a=[];l!==null;){var u=l,n=u.stateNode;if(u=u.tag,u!==5&&u!==26&&u!==27||n===null||(u=Ca(l,e),u!=null&&a.unshift(ou(l,u,n)),u=Ca(l,t),u!=null&&a.push(ou(l,u,n))),l.tag===3)return 
a;l=l.return}return[]}function Ay(l){if(l===null)return null;do l=l.return;while(l&&l.tag!==5&&l.tag!==27);return l||null}function qd(l,t,e,a,u){for(var n=t._reactName,c=[];e!==null&&e!==a;){var i=e,f=i.alternate,r=i.stateNode;if(i=i.tag,f!==null&&f===a)break;i!==5&&i!==26&&i!==27||r===null||(f=r,u?(r=Ca(e,n),r!=null&&c.unshift(ou(e,r,f))):u||(r=Ca(e,n),r!=null&&c.push(ou(e,r,f)))),e=e.return}c.length!==0&&l.push({event:t,listeners:c})}var py=/\r\n?/g,_y=/\u0000|\uFFFD/g;function Yd(l){return(typeof l=="string"?l:""+l).replace(py,`
-`).replace(_y,"")}function Gd(l,t){return t=Yd(t),Yd(l)===t}function Sl(l,t,e,a,u,n){switch(e){case"children":typeof a=="string"?t==="body"||t==="textarea"&&a===""||We(l,a):(typeof a=="number"||typeof a=="bigint")&&t!=="body"&&We(l,""+a);break;case"className":Nu(l,"class",a);break;case"tabIndex":Nu(l,"tabindex",a);break;case"dir":case"role":case"viewBox":case"width":case"height":Nu(l,e,a);break;case"style":Gf(l,a,n);break;case"data":if(t!=="object"){Nu(l,"data",a);break}case"src":case"href":if(a===""&&(t!=="a"||e!=="href")){l.removeAttribute(e);break}if(a==null||typeof a=="function"||typeof a=="symbol"||typeof a=="boolean"){l.removeAttribute(e);break}a=Uu(""+a),l.setAttribute(e,a);break;case"action":case"formAction":if(typeof a=="function"){l.setAttribute(e,"javascript:throw new Error('A React form was unexpectedly submitted. If you called form.submit() manually, consider using form.requestSubmit() instead. If you\\'re trying to use event.stopPropagation() in a submit event handler, consider also calling event.preventDefault().')");break}else typeof n=="function"&&(e==="formAction"?(t!=="input"&&Sl(l,t,"name",u.name,u,null),Sl(l,t,"formEncType",u.formEncType,u,null),Sl(l,t,"formMethod",u.formMethod,u,null),Sl(l,t,"formTarget",u.formTarget,u,null)):(Sl(l,t,"encType",u.encType,u,null),Sl(l,t,"method",u.method,u,null),Sl(l,t,"target",u.target,u,null)));if(a==null||typeof a=="symbol"||typeof a=="boolean"){l.removeAttribute(e);break}a=Uu(""+a),l.setAttribute(e,a);break;case"onClick":a!=null&&(l.onclick=Ht);break;case"onScroll":a!=null&&F("scroll",l);break;case"onScrollEnd":a!=null&&F("scrollend",l);break;case"dangerouslySetInnerHTML":if(a!=null){if(typeof a!="object"||!("__html"in a))throw Error(o(61));if(e=a.__html,e!=null){if(u.children!=null)throw Error(o(60));l.innerHTML=e}}break;case"multiple":l.multiple=a&&typeof a!="function"&&typeof a!="symbol";break;case"muted":l.muted=a&&typeof a!="function"&&typeof 
a!="symbol";break;case"suppressContentEditableWarning":case"suppressHydrationWarning":case"defaultValue":case"defaultChecked":case"innerHTML":case"ref":break;case"autoFocus":break;case"xlinkHref":if(a==null||typeof a=="function"||typeof a=="boolean"||typeof a=="symbol"){l.removeAttribute("xlink:href");break}e=Uu(""+a),l.setAttributeNS("http://www.w3.org/1999/xlink","xlink:href",e);break;case"contentEditable":case"spellCheck":case"draggable":case"value":case"autoReverse":case"externalResourcesRequired":case"focusable":case"preserveAlpha":a!=null&&typeof a!="function"&&typeof a!="symbol"?l.setAttribute(e,""+a):l.removeAttribute(e);break;case"inert":case"allowFullScreen":case"async":case"autoPlay":case"controls":case"default":case"defer":case"disabled":case"disablePictureInPicture":case"disableRemotePlayback":case"formNoValidate":case"hidden":case"loop":case"noModule":case"noValidate":case"open":case"playsInline":case"readOnly":case"required":case"reversed":case"scoped":case"seamless":case"itemScope":a&&typeof a!="function"&&typeof a!="symbol"?l.setAttribute(e,""):l.removeAttribute(e);break;case"capture":case"download":a===!0?l.setAttribute(e,""):a!==!1&&a!=null&&typeof a!="function"&&typeof a!="symbol"?l.setAttribute(e,a):l.removeAttribute(e);break;case"cols":case"rows":case"size":case"span":a!=null&&typeof a!="function"&&typeof a!="symbol"&&!isNaN(a)&&1<=a?l.setAttribute(e,a):l.removeAttribute(e);break;case"rowSpan":case"start":a==null||typeof a=="function"||typeof 
a=="symbol"||isNaN(a)?l.removeAttribute(e):l.setAttribute(e,a);break;case"popover":F("beforetoggle",l),F("toggle",l),xu(l,"popover",a);break;case"xlinkActuate":jt(l,"http://www.w3.org/1999/xlink","xlink:actuate",a);break;case"xlinkArcrole":jt(l,"http://www.w3.org/1999/xlink","xlink:arcrole",a);break;case"xlinkRole":jt(l,"http://www.w3.org/1999/xlink","xlink:role",a);break;case"xlinkShow":jt(l,"http://www.w3.org/1999/xlink","xlink:show",a);break;case"xlinkTitle":jt(l,"http://www.w3.org/1999/xlink","xlink:title",a);break;case"xlinkType":jt(l,"http://www.w3.org/1999/xlink","xlink:type",a);break;case"xmlBase":jt(l,"http://www.w3.org/XML/1998/namespace","xml:base",a);break;case"xmlLang":jt(l,"http://www.w3.org/XML/1998/namespace","xml:lang",a);break;case"xmlSpace":jt(l,"http://www.w3.org/XML/1998/namespace","xml:space",a);break;case"is":xu(l,"is",a);break;case"innerText":case"textContent":break;default:(!(2<e.length)||e[0]!=="o"&&e[0]!=="O"||e[1]!=="n"&&e[1]!=="N")&&(e=Po.get(e)||e,xu(l,e,a))}}function Zi(l,t,e,a,u,n){switch(e){case"style":Gf(l,a,n);break;case"dangerouslySetInnerHTML":if(a!=null){if(typeof a!="object"||!("__html"in a))throw Error(o(61));if(e=a.__html,e!=null){if(u.children!=null)throw Error(o(60));l.innerHTML=e}}break;case"children":typeof a=="string"?We(l,a):(typeof a=="number"||typeof a=="bigint")&&We(l,""+a);break;case"onScroll":a!=null&&F("scroll",l);break;case"onScrollEnd":a!=null&&F("scrollend",l);break;case"onClick":a!=null&&(l.onclick=Ht);break;case"suppressContentEditableWarning":case"suppressHydrationWarning":case"innerHTML":case"ref":break;case"innerText":case"textContent":break;default:if(!Df.hasOwnProperty(e))l:{if(e[0]==="o"&&e[1]==="n"&&(u=e.endsWith("Capture"),t=e.slice(2,u?e.length-7:void 0),n=l[Fl]||null,n=n!=null?n[e]:null,typeof n=="function"&&l.removeEventListener(t,n,u),typeof a=="function")){typeof n!="function"&&n!==null&&(e in l?l[e]=null:l.hasAttribute(e)&&l.removeAttribute(e)),l.addEventListener(t,a,u);break l}e in 
l?l[e]=a:a===!0?l.setAttribute(e,""):xu(l,e,a)}}}function Ll(l,t,e){switch(t){case"div":case"span":case"svg":case"path":case"a":case"g":case"p":case"li":break;case"img":F("error",l),F("load",l);var a=!1,u=!1,n;for(n in e)if(e.hasOwnProperty(n)){var c=e[n];if(c!=null)switch(n){case"src":a=!0;break;case"srcSet":u=!0;break;case"children":case"dangerouslySetInnerHTML":throw Error(o(137,t));default:Sl(l,t,n,c,e,null)}}u&&Sl(l,t,"srcSet",e.srcSet,e,null),a&&Sl(l,t,"src",e.src,e,null);return;case"input":F("invalid",l);var i=n=c=u=null,f=null,r=null;for(a in e)if(e.hasOwnProperty(a)){var b=e[a];if(b!=null)switch(a){case"name":u=b;break;case"type":c=b;break;case"checked":f=b;break;case"defaultChecked":r=b;break;case"value":n=b;break;case"defaultValue":i=b;break;case"children":case"dangerouslySetInnerHTML":if(b!=null)throw Error(o(137,t));break;default:Sl(l,t,a,b,e,null)}}Hf(l,n,i,f,r,c,u,!1);return;case"select":F("invalid",l),a=c=n=null;for(u in e)if(e.hasOwnProperty(u)&&(i=e[u],i!=null))switch(u){case"value":n=i;break;case"defaultValue":c=i;break;case"multiple":a=i;default:Sl(l,t,u,i,e,null)}t=n,e=c,l.multiple=!!a,t!=null?we(l,!!a,t,!1):e!=null&&we(l,!!a,e,!0);return;case"textarea":F("invalid",l),n=u=a=null;for(c in e)if(e.hasOwnProperty(c)&&(i=e[c],i!=null))switch(c){case"value":a=i;break;case"defaultValue":u=i;break;case"children":n=i;break;case"dangerouslySetInnerHTML":if(i!=null)throw Error(o(91));break;default:Sl(l,t,c,i,e,null)}qf(l,a,u,n);return;case"option":for(f in e)e.hasOwnProperty(f)&&(a=e[f],a!=null)&&(f==="selected"?l.selected=a&&typeof a!="function"&&typeof 
a!="symbol":Sl(l,t,f,a,e,null));return;case"dialog":F("beforetoggle",l),F("toggle",l),F("cancel",l),F("close",l);break;case"iframe":case"object":F("load",l);break;case"video":case"audio":for(a=0;a<du.length;a++)F(du[a],l);break;case"image":F("error",l),F("load",l);break;case"details":F("toggle",l);break;case"embed":case"source":case"link":F("error",l),F("load",l);case"area":case"base":case"br":case"col":case"hr":case"keygen":case"meta":case"param":case"track":case"wbr":case"menuitem":for(r in e)if(e.hasOwnProperty(r)&&(a=e[r],a!=null))switch(r){case"children":case"dangerouslySetInnerHTML":throw Error(o(137,t));default:Sl(l,t,r,a,e,null)}return;default:if(ec(t)){for(b in e)e.hasOwnProperty(b)&&(a=e[b],a!==void 0&&Zi(l,t,b,a,e,void 0));return}}for(i in e)e.hasOwnProperty(i)&&(a=e[i],a!=null&&Sl(l,t,i,a,e,null))}function Oy(l,t,e,a){switch(t){case"div":case"span":case"svg":case"path":case"a":case"g":case"p":case"li":break;case"input":var u=null,n=null,c=null,i=null,f=null,r=null,b=null;for(S in e){var A=e[S];if(e.hasOwnProperty(S)&&A!=null)switch(S){case"checked":break;case"value":break;case"defaultValue":f=A;default:a.hasOwnProperty(S)||Sl(l,t,S,null,a,A)}}for(var h in a){var S=a[h];if(A=e[h],a.hasOwnProperty(h)&&(S!=null||A!=null))switch(h){case"type":n=S;break;case"name":u=S;break;case"checked":r=S;break;case"defaultChecked":b=S;break;case"value":c=S;break;case"defaultValue":i=S;break;case"children":case"dangerouslySetInnerHTML":if(S!=null)throw Error(o(137,t));break;default:S!==A&&Sl(l,t,h,S,a,A)}}lc(l,c,i,f,r,b,n,u);return;case"select":S=c=i=h=null;for(n in e)if(f=e[n],e.hasOwnProperty(n)&&f!=null)switch(n){case"value":break;case"multiple":S=f;default:a.hasOwnProperty(n)||Sl(l,t,n,null,a,f)}for(u in 
a)if(n=a[u],f=e[u],a.hasOwnProperty(u)&&(n!=null||f!=null))switch(u){case"value":h=n;break;case"defaultValue":i=n;break;case"multiple":c=n;default:n!==f&&Sl(l,t,u,n,a,f)}t=i,e=c,a=S,h!=null?we(l,!!e,h,!1):!!a!=!!e&&(t!=null?we(l,!!e,t,!0):we(l,!!e,e?[]:"",!1));return;case"textarea":S=h=null;for(i in e)if(u=e[i],e.hasOwnProperty(i)&&u!=null&&!a.hasOwnProperty(i))switch(i){case"value":break;case"children":break;default:Sl(l,t,i,null,a,u)}for(c in a)if(u=a[c],n=e[c],a.hasOwnProperty(c)&&(u!=null||n!=null))switch(c){case"value":h=u;break;case"defaultValue":S=u;break;case"children":break;case"dangerouslySetInnerHTML":if(u!=null)throw Error(o(91));break;default:u!==n&&Sl(l,t,c,u,a,n)}Bf(l,h,S);return;case"option":for(var j in e)h=e[j],e.hasOwnProperty(j)&&h!=null&&!a.hasOwnProperty(j)&&(j==="selected"?l.selected=!1:Sl(l,t,j,null,a,h));for(f in a)h=a[f],S=e[f],a.hasOwnProperty(f)&&h!==S&&(h!=null||S!=null)&&(f==="selected"?l.selected=h&&typeof h!="function"&&typeof h!="symbol":Sl(l,t,f,h,a,S));return;case"img":case"link":case"area":case"base":case"br":case"col":case"embed":case"hr":case"keygen":case"meta":case"param":case"source":case"track":case"wbr":case"menuitem":for(var Z in e)h=e[Z],e.hasOwnProperty(Z)&&h!=null&&!a.hasOwnProperty(Z)&&Sl(l,t,Z,null,a,h);for(r in a)if(h=a[r],S=e[r],a.hasOwnProperty(r)&&h!==S&&(h!=null||S!=null))switch(r){case"children":case"dangerouslySetInnerHTML":if(h!=null)throw Error(o(137,t));break;default:Sl(l,t,r,h,a,S)}return;default:if(ec(t)){for(var bl in e)h=e[bl],e.hasOwnProperty(bl)&&h!==void 0&&!a.hasOwnProperty(bl)&&Zi(l,t,bl,void 0,a,h);for(b in a)h=a[b],S=e[b],!a.hasOwnProperty(b)||h===S||h===void 0&&S===void 0||Zi(l,t,b,h,a,S);return}}for(var m in e)h=e[m],e.hasOwnProperty(m)&&h!=null&&!a.hasOwnProperty(m)&&Sl(l,t,m,null,a,h);for(A in a)h=a[A],S=e[A],!a.hasOwnProperty(A)||h===S||h==null&&S==null||Sl(l,t,A,h,a,S)}function 
Qd(l){switch(l){case"css":case"script":case"font":case"img":case"image":case"input":case"link":return!0;default:return!1}}function My(){if(typeof performance.getEntriesByType=="function"){for(var l=0,t=0,e=performance.getEntriesByType("resource"),a=0;a<e.length;a++){var u=e[a],n=u.transferSize,c=u.initiatorType,i=u.duration;if(n&&i&&Qd(c)){for(c=0,i=u.responseEnd,a+=1;a<e.length;a++){var f=e[a],r=f.startTime;if(r>i)break;var b=f.transferSize,A=f.initiatorType;b&&Qd(A)&&(f=f.responseEnd,c+=b*(f<i?1:(i-r)/(f-r)))}if(--a,t+=8*(n+c)/(u.duration/1e3),l++,10<l)break}}if(0<l)return t/l/1e6}return navigator.connection&&(l=navigator.connection.downlink,typeof l=="number")?l:5}var Vi=null,Li=null;function Mn(l){return l.nodeType===9?l:l.ownerDocument}function Xd(l){switch(l){case"http://www.w3.org/2000/svg":return 1;case"http://www.w3.org/1998/Math/MathML":return 2;default:return 0}}function Zd(l,t){if(l===0)switch(t){case"svg":return 1;case"math":return 2;default:return 0}return l===1&&t==="foreignObject"?0:l}function Ki(l,t){return l==="textarea"||l==="noscript"||typeof t.children=="string"||typeof t.children=="number"||typeof t.children=="bigint"||typeof t.dangerouslySetInnerHTML=="object"&&t.dangerouslySetInnerHTML!==null&&t.dangerouslySetInnerHTML.__html!=null}var Ji=null;function xy(){var l=window.event;return l&&l.type==="popstate"?l===Ji?!1:(Ji=l,!0):(Ji=null,!1)}var Vd=typeof setTimeout=="function"?setTimeout:void 0,Ny=typeof clearTimeout=="function"?clearTimeout:void 0,Ld=typeof Promise=="function"?Promise:void 0,Dy=typeof queueMicrotask=="function"?queueMicrotask:typeof Ld<"u"?function(l){return Ld.resolve(null).then(l).catch(Uy)}:Vd;function Uy(l){setTimeout(function(){throw l})}function ge(l){return l==="head"}function Kd(l,t){var e=t,a=0;do{var u=e.nextSibling;if(l.removeChild(e),u&&u.nodeType===8)if(e=u.data,e==="/$"||e==="/&"){if(a===0){l.removeChild(u),_a(t);return}a--}else if(e==="$"||e==="$?"||e==="$~"||e==="$!"||e==="&")a++;else 
if(e==="html")mu(l.ownerDocument.documentElement);else if(e==="head"){e=l.ownerDocument.head,mu(e);for(var n=e.firstChild;n;){var c=n.nextSibling,i=n.nodeName;n[Da]||i==="SCRIPT"||i==="STYLE"||i==="LINK"&&n.rel.toLowerCase()==="stylesheet"||e.removeChild(n),n=c}}else e==="body"&&mu(l.ownerDocument.body);e=u}while(e);_a(t)}function Jd(l,t){var e=l;l=0;do{var a=e.nextSibling;if(e.nodeType===1?t?(e._stashedDisplay=e.style.display,e.style.display="none"):(e.style.display=e._stashedDisplay||"",e.getAttribute("style")===""&&e.removeAttribute("style")):e.nodeType===3&&(t?(e._stashedText=e.nodeValue,e.nodeValue=""):e.nodeValue=e._stashedText||""),a&&a.nodeType===8)if(e=a.data,e==="/$"){if(l===0)break;l--}else e!=="$"&&e!=="$?"&&e!=="$~"&&e!=="$!"||l++;e=a}while(e)}function wi(l){var t=l.firstChild;for(t&&t.nodeType===10&&(t=t.nextSibling);t;){var e=t;switch(t=t.nextSibling,e.nodeName){case"HTML":case"HEAD":case"BODY":wi(e),In(e);continue;case"SCRIPT":case"STYLE":continue;case"LINK":if(e.rel.toLowerCase()==="stylesheet")continue}l.removeChild(e)}}function Cy(l,t,e,a){for(;l.nodeType===1;){var u=e;if(l.nodeName.toLowerCase()!==t.toLowerCase()){if(!a&&(l.nodeName!=="INPUT"||l.type!=="hidden"))break}else if(a){if(!l[Da])switch(t){case"meta":if(!l.hasAttribute("itemprop"))break;return l;case"link":if(n=l.getAttribute("rel"),n==="stylesheet"&&l.hasAttribute("data-precedence"))break;if(n!==u.rel||l.getAttribute("href")!==(u.href==null||u.href===""?null:u.href)||l.getAttribute("crossorigin")!==(u.crossOrigin==null?null:u.crossOrigin)||l.getAttribute("title")!==(u.title==null?null:u.title))break;return l;case"style":if(l.hasAttribute("data-precedence"))break;return l;case"script":if(n=l.getAttribute("src"),(n!==(u.src==null?null:u.src)||l.getAttribute("type")!==(u.type==null?null:u.type)||l.getAttribute("crossorigin")!==(u.crossOrigin==null?null:u.crossOrigin))&&n&&l.hasAttribute("async")&&!l.hasAttribute("itemprop"))break;return l;default:return l}}else 
if(t==="input"&&l.type==="hidden"){var n=u.name==null?null:""+u.name;if(u.type==="hidden"&&l.getAttribute("name")===n)return l}else return l;if(l=At(l.nextSibling),l===null)break}return null}function Ry(l,t,e){if(t==="")return null;for(;l.nodeType!==3;)if((l.nodeType!==1||l.nodeName!=="INPUT"||l.type!=="hidden")&&!e||(l=At(l.nextSibling),l===null))return null;return l}function wd(l,t){for(;l.nodeType!==8;)if((l.nodeType!==1||l.nodeName!=="INPUT"||l.type!=="hidden")&&!t||(l=At(l.nextSibling),l===null))return null;return l}function Wi(l){return l.data==="$?"||l.data==="$~"}function $i(l){return l.data==="$!"||l.data==="$?"&&l.ownerDocument.readyState!=="loading"}function jy(l,t){var e=l.ownerDocument;if(l.data==="$~")l._reactRetry=t;else if(l.data!=="$?"||e.readyState!=="loading")t();else{var a=function(){t(),e.removeEventListener("DOMContentLoaded",a)};e.addEventListener("DOMContentLoaded",a),l._reactRetry=a}}function At(l){for(;l!=null;l=l.nextSibling){var t=l.nodeType;if(t===1||t===3)break;if(t===8){if(t=l.data,t==="$"||t==="$!"||t==="$?"||t==="$~"||t==="&"||t==="F!"||t==="F")break;if(t==="/$"||t==="/&")return null}}return l}var ki=null;function Wd(l){l=l.nextSibling;for(var t=0;l;){if(l.nodeType===8){var e=l.data;if(e==="/$"||e==="/&"){if(t===0)return At(l.nextSibling);t--}else e!=="$"&&e!=="$!"&&e!=="$?"&&e!=="$~"&&e!=="&"||t++}l=l.nextSibling}return null}function $d(l){l=l.previousSibling;for(var t=0;l;){if(l.nodeType===8){var e=l.data;if(e==="$"||e==="$!"||e==="$?"||e==="$~"||e==="&"){if(t===0)return l;t--}else e!=="/$"&&e!=="/&"||t++}l=l.previousSibling}return null}function kd(l,t,e){switch(t=Mn(e),l){case"html":if(l=t.documentElement,!l)throw Error(o(452));return l;case"head":if(l=t.head,!l)throw Error(o(453));return l;case"body":if(l=t.body,!l)throw Error(o(454));return l;default:throw Error(o(451))}}function mu(l){for(var t=l.attributes;t.length;)l.removeAttributeNode(t[0]);In(l)}var pt=new Map,Fd=new Set;function xn(l){return typeof 
l.getRootNode=="function"?l.getRootNode():l.nodeType===9?l:l.ownerDocument}var Ft=N.d;N.d={f:Hy,r:By,D:qy,C:Yy,L:Gy,m:Qy,X:Zy,S:Xy,M:Vy};function Hy(){var l=Ft.f(),t=bn();return l||t}function By(l){var t=Le(l);t!==null&&t.tag===5&&t.type==="form"?r0(t):Ft.r(l)}var Ta=typeof document>"u"?null:document;function Id(l,t,e){var a=Ta;if(a&&typeof t=="string"&&t){var u=vt(t);u='link[rel="'+l+'"][href="'+u+'"]',typeof e=="string"&&(u+='[crossorigin="'+e+'"]'),Fd.has(u)||(Fd.add(u),l={rel:l,crossOrigin:e,href:t},a.querySelector(u)===null&&(t=a.createElement("link"),Ll(t,"link",l),ql(t),a.head.appendChild(t)))}}function qy(l){Ft.D(l),Id("dns-prefetch",l,null)}function Yy(l,t){Ft.C(l,t),Id("preconnect",l,t)}function Gy(l,t,e){Ft.L(l,t,e);var a=Ta;if(a&&l&&t){var u='link[rel="preload"][as="'+vt(t)+'"]';t==="image"&&e&&e.imageSrcSet?(u+='[imagesrcset="'+vt(e.imageSrcSet)+'"]',typeof e.imageSizes=="string"&&(u+='[imagesizes="'+vt(e.imageSizes)+'"]')):u+='[href="'+vt(l)+'"]';var n=u;switch(t){case"style":n=Aa(l);break;case"script":n=pa(l)}pt.has(n)||(l=U({rel:"preload",href:t==="image"&&e&&e.imageSrcSet?void 0:l,as:t},e),pt.set(n,l),a.querySelector(u)!==null||t==="style"&&a.querySelector(yu(n))||t==="script"&&a.querySelector(ru(n))||(t=a.createElement("link"),Ll(t,"link",l),ql(t),a.head.appendChild(t)))}}function Qy(l,t){Ft.m(l,t);var e=Ta;if(e&&l){var a=t&&typeof t.as=="string"?t.as:"script",u='link[rel="modulepreload"][as="'+vt(a)+'"][href="'+vt(l)+'"]',n=u;switch(a){case"audioworklet":case"paintworklet":case"serviceworker":case"sharedworker":case"worker":case"script":n=pa(l)}if(!pt.has(n)&&(l=U({rel:"modulepreload",href:l},t),pt.set(n,l),e.querySelector(u)===null)){switch(a){case"audioworklet":case"paintworklet":case"serviceworker":case"sharedworker":case"worker":case"script":if(e.querySelector(ru(n)))return}a=e.createElement("link"),Ll(a,"link",l),ql(a),e.head.appendChild(a)}}}function Xy(l,t,e){Ft.S(l,t,e);var a=Ta;if(a&&l){var 
u=Ke(a).hoistableStyles,n=Aa(l);t=t||"default";var c=u.get(n);if(!c){var i={loading:0,preload:null};if(c=a.querySelector(yu(n)))i.loading=5;else{l=U({rel:"stylesheet",href:l,"data-precedence":t},e),(e=pt.get(n))&&Fi(l,e);var f=c=a.createElement("link");ql(f),Ll(f,"link",l),f._p=new Promise(function(r,b){f.onload=r,f.onerror=b}),f.addEventListener("load",function(){i.loading|=1}),f.addEventListener("error",function(){i.loading|=2}),i.loading|=4,Nn(c,t,a)}c={type:"stylesheet",instance:c,count:1,state:i},u.set(n,c)}}}function Zy(l,t){Ft.X(l,t);var e=Ta;if(e&&l){var a=Ke(e).hoistableScripts,u=pa(l),n=a.get(u);n||(n=e.querySelector(ru(u)),n||(l=U({src:l,async:!0},t),(t=pt.get(u))&&Ii(l,t),n=e.createElement("script"),ql(n),Ll(n,"link",l),e.head.appendChild(n)),n={type:"script",instance:n,count:1,state:null},a.set(u,n))}}function Vy(l,t){Ft.M(l,t);var e=Ta;if(e&&l){var a=Ke(e).hoistableScripts,u=pa(l),n=a.get(u);n||(n=e.querySelector(ru(u)),n||(l=U({src:l,async:!0,type:"module"},t),(t=pt.get(u))&&Ii(l,t),n=e.createElement("script"),ql(n),Ll(n,"link",l),e.head.appendChild(n)),n={type:"script",instance:n,count:1,state:null},a.set(u,n))}}function Pd(l,t,e,a){var u=(u=K.current)?xn(u):null;if(!u)throw Error(o(446));switch(l){case"meta":case"title":return null;case"style":return typeof e.precedence=="string"&&typeof e.href=="string"?(t=Aa(e.href),e=Ke(u).hoistableStyles,a=e.get(t),a||(a={type:"style",instance:null,count:0,state:null},e.set(t,a)),a):{type:"void",instance:null,count:0,state:null};case"link":if(e.rel==="stylesheet"&&typeof e.href=="string"&&typeof e.precedence=="string"){l=Aa(e.href);var 
n=Ke(u).hoistableStyles,c=n.get(l);if(c||(u=u.ownerDocument||u,c={type:"stylesheet",instance:null,count:0,state:{loading:0,preload:null}},n.set(l,c),(n=u.querySelector(yu(l)))&&!n._p&&(c.instance=n,c.state.loading=5),pt.has(l)||(e={rel:"preload",as:"style",href:e.href,crossOrigin:e.crossOrigin,integrity:e.integrity,media:e.media,hrefLang:e.hrefLang,referrerPolicy:e.referrerPolicy},pt.set(l,e),n||Ly(u,l,e,c.state))),t&&a===null)throw Error(o(528,""));return c}if(t&&a!==null)throw Error(o(529,""));return null;case"script":return t=e.async,e=e.src,typeof e=="string"&&t&&typeof t!="function"&&typeof t!="symbol"?(t=pa(e),e=Ke(u).hoistableScripts,a=e.get(t),a||(a={type:"script",instance:null,count:0,state:null},e.set(t,a)),a):{type:"void",instance:null,count:0,state:null};default:throw Error(o(444,l))}}function Aa(l){return'href="'+vt(l)+'"'}function yu(l){return'link[rel="stylesheet"]['+l+"]"}function lo(l){return U({},l,{"data-precedence":l.precedence,precedence:null})}function Ly(l,t,e,a){l.querySelector('link[rel="preload"][as="style"]['+t+"]")?a.loading=1:(t=l.createElement("link"),a.preload=t,t.addEventListener("load",function(){return a.loading|=1}),t.addEventListener("error",function(){return a.loading|=2}),Ll(t,"link",e),ql(t),l.head.appendChild(t))}function pa(l){return'[src="'+vt(l)+'"]'}function ru(l){return"script[async]"+l}function to(l,t,e){if(t.count++,t.instance===null)switch(t.type){case"style":var a=l.querySelector('style[data-href~="'+vt(e.href)+'"]');if(a)return t.instance=a,ql(a),a;var u=U({},e,{"data-href":e.href,"data-precedence":e.precedence,href:null,precedence:null});return a=(l.ownerDocument||l).createElement("style"),ql(a),Ll(a,"style",u),Nn(a,e.precedence,l),t.instance=a;case"stylesheet":u=Aa(e.href);var n=l.querySelector(yu(u));if(n)return t.state.loading|=4,t.instance=n,ql(n),n;a=lo(e),(u=pt.get(u))&&Fi(a,u),n=(l.ownerDocument||l).createElement("link"),ql(n);var c=n;return c._p=new 
Promise(function(i,f){c.onload=i,c.onerror=f}),Ll(n,"link",a),t.state.loading|=4,Nn(n,e.precedence,l),t.instance=n;case"script":return n=pa(e.src),(u=l.querySelector(ru(n)))?(t.instance=u,ql(u),u):(a=e,(u=pt.get(n))&&(a=U({},e),Ii(a,u)),l=l.ownerDocument||l,u=l.createElement("script"),ql(u),Ll(u,"link",a),l.head.appendChild(u),t.instance=u);case"void":return null;default:throw Error(o(443,t.type))}else t.type==="stylesheet"&&(t.state.loading&4)===0&&(a=t.instance,t.state.loading|=4,Nn(a,e.precedence,l));return t.instance}function Nn(l,t,e){for(var a=e.querySelectorAll('link[rel="stylesheet"][data-precedence],style[data-precedence]'),u=a.length?a[a.length-1]:null,n=u,c=0;c<a.length;c++){var i=a[c];if(i.dataset.precedence===t)n=i;else if(n!==u)break}n?n.parentNode.insertBefore(l,n.nextSibling):(t=e.nodeType===9?e.head:e,t.insertBefore(l,t.firstChild))}function Fi(l,t){l.crossOrigin==null&&(l.crossOrigin=t.crossOrigin),l.referrerPolicy==null&&(l.referrerPolicy=t.referrerPolicy),l.title==null&&(l.title=t.title)}function Ii(l,t){l.crossOrigin==null&&(l.crossOrigin=t.crossOrigin),l.referrerPolicy==null&&(l.referrerPolicy=t.referrerPolicy),l.integrity==null&&(l.integrity=t.integrity)}var Dn=null;function eo(l,t,e){if(Dn===null){var a=new Map,u=Dn=new Map;u.set(e,a)}else u=Dn,a=u.get(e),a||(a=new Map,u.set(e,a));if(a.has(l))return a;for(a.set(l,null),e=e.getElementsByTagName(l),u=0;u<e.length;u++){var n=e[u];if(!(n[Da]||n[Ql]||l==="link"&&n.getAttribute("rel")==="stylesheet")&&n.namespaceURI!=="http://www.w3.org/2000/svg"){var c=n.getAttribute(t)||"";c=l+c;var i=a.get(c);i?i.push(n):a.set(c,[n])}}return a}function ao(l,t,e){l=l.ownerDocument||l,l.head.insertBefore(e,t==="title"?l.querySelector("head > title"):null)}function Ky(l,t,e){if(e===1||t.itemProp!=null)return!1;switch(l){case"meta":case"title":return!0;case"style":if(typeof t.precedence!="string"||typeof t.href!="string"||t.href==="")break;return!0;case"link":if(typeof t.rel!="string"||typeof 
t.href!="string"||t.href===""||t.onLoad||t.onError)break;return t.rel==="stylesheet"?(l=t.disabled,typeof t.precedence=="string"&&l==null):!0;case"script":if(t.async&&typeof t.async!="function"&&typeof t.async!="symbol"&&!t.onLoad&&!t.onError&&t.src&&typeof t.src=="string")return!0}return!1}function uo(l){return!(l.type==="stylesheet"&&(l.state.loading&3)===0)}function Jy(l,t,e,a){if(e.type==="stylesheet"&&(typeof a.media!="string"||matchMedia(a.media).matches!==!1)&&(e.state.loading&4)===0){if(e.instance===null){var u=Aa(a.href),n=t.querySelector(yu(u));if(n){t=n._p,t!==null&&typeof t=="object"&&typeof t.then=="function"&&(l.count++,l=Un.bind(l),t.then(l,l)),e.state.loading|=4,e.instance=n,ql(n);return}n=t.ownerDocument||t,a=lo(a),(u=pt.get(u))&&Fi(a,u),n=n.createElement("link"),ql(n);var c=n;c._p=new Promise(function(i,f){c.onload=i,c.onerror=f}),Ll(n,"link",a),e.instance=n}l.stylesheets===null&&(l.stylesheets=new Map),l.stylesheets.set(e,t),(t=e.state.preload)&&(e.state.loading&3)===0&&(l.count++,e=Un.bind(l),t.addEventListener("load",e),t.addEventListener("error",e))}}var Pi=0;function wy(l,t){return l.stylesheets&&l.count===0&&Rn(l,l.stylesheets),0<l.count||0<l.imgCount?function(e){var a=setTimeout(function(){if(l.stylesheets&&Rn(l,l.stylesheets),l.unsuspend){var n=l.unsuspend;l.unsuspend=null,n()}},6e4+t);0<l.imgBytes&&Pi===0&&(Pi=62500*My());var u=setTimeout(function(){if(l.waitingForImages=!1,l.count===0&&(l.stylesheets&&Rn(l,l.stylesheets),l.unsuspend)){var n=l.unsuspend;l.unsuspend=null,n()}},(l.imgBytes>Pi?50:800)+t);return l.unsuspend=e,function(){l.unsuspend=null,clearTimeout(a),clearTimeout(u)}}:null}function Un(){if(this.count--,this.count===0&&(this.imgCount===0||!this.waitingForImages)){if(this.stylesheets)Rn(this,this.stylesheets);else if(this.unsuspend){var l=this.unsuspend;this.unsuspend=null,l()}}}var Cn=null;function Rn(l,t){l.stylesheets=null,l.unsuspend!==null&&(l.count++,Cn=new Map,t.forEach(Wy,l),Cn=null,Un.call(l))}function 
Wy(l,t){if(!(t.state.loading&4)){var e=Cn.get(l);if(e)var a=e.get(null);else{e=new Map,Cn.set(l,e);for(var u=l.querySelectorAll("link[data-precedence],style[data-precedence]"),n=0;n<u.length;n++){var c=u[n];(c.nodeName==="LINK"||c.getAttribute("media")!=="not all")&&(e.set(c.dataset.precedence,c),a=c)}a&&e.set(null,a)}u=t.instance,c=u.getAttribute("data-precedence"),n=e.get(c)||a,n===a&&e.set(null,u),e.set(c,u),this.count++,a=Un.bind(this),u.addEventListener("load",a),u.addEventListener("error",a),n?n.parentNode.insertBefore(u,n.nextSibling):(l=l.nodeType===9?l.head:l,l.insertBefore(u,l.firstChild)),t.state.loading|=4}}var hu={$$typeof:ul,Provider:null,Consumer:null,_currentValue:V,_currentValue2:V,_threadCount:0};function $y(l,t,e,a,u,n,c,i,f){this.tag=1,this.containerInfo=l,this.pingCache=this.current=this.pendingChildren=null,this.timeoutHandle=-1,this.callbackNode=this.next=this.pendingContext=this.context=this.cancelPendingCommit=null,this.callbackPriority=0,this.expirationTimes=Wn(-1),this.entangledLanes=this.shellSuspendCounter=this.errorRecoveryDisabledLanes=this.expiredLanes=this.warmLanes=this.pingedLanes=this.suspendedLanes=this.pendingLanes=0,this.entanglements=Wn(0),this.hiddenUpdates=Wn(null),this.identifierPrefix=a,this.onUncaughtError=u,this.onCaughtError=n,this.onRecoverableError=c,this.pooledCache=null,this.pooledCacheLanes=0,this.formState=f,this.incompleteTransitions=new Map}function no(l,t,e,a,u,n,c,i,f,r,b,A){return l=new $y(l,t,e,c,f,r,b,A,i),t=1,n===!0&&(t|=24),n=st(3,null,null,t),l.current=n,n.stateNode=l,t=Cc(),t.refCount++,l.pooledCache=t,t.refCount++,n.memoizedState={element:a,isDehydrated:e,cache:t},Bc(n),l}function co(l){return l?(l=ta,l):ta}function io(l,t,e,a,u,n){u=co(u),a.context===null?a.context=u:a.pendingContext=u,a=ce(t),a.payload={element:e},n=n===void 0?null:n,n!==null&&(a.callback=n),e=ie(l,a,t),e!==null&&(at(e,l,t),wa(e,l,t))}function fo(l,t){if(l=l.memoizedState,l!==null&&l.dehydrated!==null){var 
e=l.retryLane;l.retryLane=e!==0&&e<t?e:t}}function lf(l,t){fo(l,t),(l=l.alternate)&&fo(l,t)}function so(l){if(l.tag===13||l.tag===31){var t=Ne(l,67108864);t!==null&&at(t,l,67108864),lf(l,67108864)}}function oo(l){if(l.tag===13||l.tag===31){var t=rt();t=$n(t);var e=Ne(l,t);e!==null&&at(e,l,t),lf(l,t)}}var jn=!0;function ky(l,t,e,a){var u=z.T;z.T=null;var n=N.p;try{N.p=2,tf(l,t,e,a)}finally{N.p=n,z.T=u}}function Fy(l,t,e,a){var u=z.T;z.T=null;var n=N.p;try{N.p=8,tf(l,t,e,a)}finally{N.p=n,z.T=u}}function tf(l,t,e,a){if(jn){var u=ef(a);if(u===null)Xi(l,t,a,Hn,e),yo(l,a);else if(Py(u,l,t,e,a))a.stopPropagation();else if(yo(l,a),t&4&&-1<Iy.indexOf(l)){for(;u!==null;){var n=Le(u);if(n!==null)switch(n.tag){case 3:if(n=n.stateNode,n.current.memoizedState.isDehydrated){var c=pe(n.pendingLanes);if(c!==0){var i=n;for(i.pendingLanes|=2,i.entangledLanes|=2;c;){var f=1<<31-it(c);i.entanglements[1]|=f,c&=~f}Ct(n),(sl&6)===0&&(gn=nt()+500,su(0))}}break;case 31:case 13:i=Ne(n,2),i!==null&&at(i,n,2),bn(),lf(n,2)}if(n=ef(a),n===null&&Xi(l,t,a,Hn,e),n===u)break;u=n}u!==null&&a.stopPropagation()}else Xi(l,t,a,null,e)}}function ef(l){return l=uc(l),af(l)}var Hn=null;function af(l){if(Hn=null,l=Ve(l),l!==null){var t=Y(l);if(t===null)l=null;else{var e=t.tag;if(e===13){if(l=H(t),l!==null)return l;l=null}else if(e===31){if(l=R(t),l!==null)return l;l=null}else if(e===3){if(t.stateNode.current.memoizedState.isDehydrated)return t.tag===3?t.stateNode.containerInfo:null;l=null}else t!==l&&(l=null)}}return Hn=l,null}function 
mo(l){switch(l){case"beforetoggle":case"cancel":case"click":case"close":case"contextmenu":case"copy":case"cut":case"auxclick":case"dblclick":case"dragend":case"dragstart":case"drop":case"focusin":case"focusout":case"input":case"invalid":case"keydown":case"keypress":case"keyup":case"mousedown":case"mouseup":case"paste":case"pause":case"play":case"pointercancel":case"pointerdown":case"pointerup":case"ratechange":case"reset":case"resize":case"seeked":case"submit":case"toggle":case"touchcancel":case"touchend":case"touchstart":case"volumechange":case"change":case"selectionchange":case"textInput":case"compositionstart":case"compositionend":case"compositionupdate":case"beforeblur":case"afterblur":case"beforeinput":case"blur":case"fullscreenchange":case"focus":case"hashchange":case"popstate":case"select":case"selectstart":return 2;case"drag":case"dragenter":case"dragexit":case"dragleave":case"dragover":case"mousemove":case"mouseout":case"mouseover":case"pointermove":case"pointerout":case"pointerover":case"scroll":case"touchmove":case"wheel":case"mouseenter":case"mouseleave":case"pointerenter":case"pointerleave":return 8;case"message":switch(qo()){case bf:return 2;case zf:return 8;case Au:case Yo:return 32;case Ef:return 268435456;default:return 32}default:return 32}}var uf=!1,Se=null,be=null,ze=null,vu=new Map,gu=new Map,Ee=[],Iy="mousedown mouseup touchcancel touchend touchstart auxclick dblclick pointercancel pointerdown pointerup dragend dragstart drop compositionend compositionstart keydown keypress keyup input textInput copy cut paste click change contextmenu reset".split(" ");function yo(l,t){switch(l){case"focusin":case"focusout":Se=null;break;case"dragenter":case"dragleave":be=null;break;case"mouseover":case"mouseout":ze=null;break;case"pointerover":case"pointerout":vu.delete(t.pointerId);break;case"gotpointercapture":case"lostpointercapture":gu.delete(t.pointerId)}}function Su(l,t,e,a,u,n){return 
l===null||l.nativeEvent!==n?(l={blockedOn:t,domEventName:e,eventSystemFlags:a,nativeEvent:n,targetContainers:[u]},t!==null&&(t=Le(t),t!==null&&so(t)),l):(l.eventSystemFlags|=a,t=l.targetContainers,u!==null&&t.indexOf(u)===-1&&t.push(u),l)}function Py(l,t,e,a,u){switch(t){case"focusin":return Se=Su(Se,l,t,e,a,u),!0;case"dragenter":return be=Su(be,l,t,e,a,u),!0;case"mouseover":return ze=Su(ze,l,t,e,a,u),!0;case"pointerover":var n=u.pointerId;return vu.set(n,Su(vu.get(n)||null,l,t,e,a,u)),!0;case"gotpointercapture":return n=u.pointerId,gu.set(n,Su(gu.get(n)||null,l,t,e,a,u)),!0}return!1}function ro(l){var t=Ve(l.target);if(t!==null){var e=Y(t);if(e!==null){if(t=e.tag,t===13){if(t=H(e),t!==null){l.blockedOn=t,Mf(l.priority,function(){oo(e)});return}}else if(t===31){if(t=R(e),t!==null){l.blockedOn=t,Mf(l.priority,function(){oo(e)});return}}else if(t===3&&e.stateNode.current.memoizedState.isDehydrated){l.blockedOn=e.tag===3?e.stateNode.containerInfo:null;return}}}l.blockedOn=null}function Bn(l){if(l.blockedOn!==null)return!1;for(var t=l.targetContainers;0<t.length;){var e=ef(l.nativeEvent);if(e===null){e=l.nativeEvent;var a=new e.constructor(e.type,e);ac=a,e.target.dispatchEvent(a),ac=null}else return t=Le(e),t!==null&&so(t),l.blockedOn=e,!1;t.shift()}return!0}function ho(l,t,e){Bn(l)&&e.delete(t)}function lr(){uf=!1,Se!==null&&Bn(Se)&&(Se=null),be!==null&&Bn(be)&&(be=null),ze!==null&&Bn(ze)&&(ze=null),vu.forEach(ho),gu.forEach(ho)}function qn(l,t){l.blockedOn===t&&(l.blockedOn=null,uf||(uf=!0,g.unstable_scheduleCallback(g.unstable_NormalPriority,lr)))}var Yn=null;function vo(l){Yn!==l&&(Yn=l,g.unstable_scheduleCallback(g.unstable_NormalPriority,function(){Yn===l&&(Yn=null);for(var t=0;t<l.length;t+=3){var e=l[t],a=l[t+1],u=l[t+2];if(typeof a!="function"){if(af(a||e)===null)continue;break}var n=Le(e);n!==null&&(l.splice(t,3),t-=3,ei(n,{pending:!0,data:u,method:e.method,action:a},a,u))}}))}function _a(l){function t(f){return 
qn(f,l)}Se!==null&&qn(Se,l),be!==null&&qn(be,l),ze!==null&&qn(ze,l),vu.forEach(t),gu.forEach(t);for(var e=0;e<Ee.length;e++){var a=Ee[e];a.blockedOn===l&&(a.blockedOn=null)}for(;0<Ee.length&&(e=Ee[0],e.blockedOn===null);)ro(e),e.blockedOn===null&&Ee.shift();if(e=(l.ownerDocument||l).$$reactFormReplay,e!=null)for(a=0;a<e.length;a+=3){var u=e[a],n=e[a+1],c=u[Fl]||null;if(typeof n=="function")c||vo(e);else if(c){var i=null;if(n&&n.hasAttribute("formAction")){if(u=n,c=n[Fl]||null)i=c.formAction;else if(af(u)!==null)continue}else i=c.action;typeof i=="function"?e[a+1]=i:(e.splice(a,3),a-=3),vo(e)}}}function go(){function l(n){n.canIntercept&&n.info==="react-transition"&&n.intercept({handler:function(){return new Promise(function(c){return u=c})},focusReset:"manual",scroll:"manual"})}function t(){u!==null&&(u(),u=null),a||setTimeout(e,20)}function e(){if(!a&&!navigation.transition){var n=navigation.currentEntry;n&&n.url!=null&&navigation.navigate(n.url,{state:n.getState(),info:"react-transition",history:"replace"})}}if(typeof navigation=="object"){var a=!1,u=null;return navigation.addEventListener("navigate",l),navigation.addEventListener("navigatesuccess",t),navigation.addEventListener("navigateerror",t),setTimeout(e,100),function(){a=!0,navigation.removeEventListener("navigate",l),navigation.removeEventListener("navigatesuccess",t),navigation.removeEventListener("navigateerror",t),u!==null&&(u(),u=null)}}}function nf(l){this._internalRoot=l}Gn.prototype.render=nf.prototype.render=function(l){var t=this._internalRoot;if(t===null)throw Error(o(409));var e=t.current,a=rt();io(e,a,l,t,null,null)},Gn.prototype.unmount=nf.prototype.unmount=function(){var l=this._internalRoot;if(l!==null){this._internalRoot=null;var t=l.containerInfo;io(l.current,2,null,l,null,null),bn(),t[Ze]=null}};function Gn(l){this._internalRoot=l}Gn.prototype.unstable_scheduleHydration=function(l){if(l){var t=Of();l={blockedOn:null,target:l,priority:t};for(var 
e=0;e<Ee.length&&t!==0&&t<Ee[e].priority;e++);Ee.splice(e,0,l),e===0&&ro(l)}};var So=M.version;if(So!=="19.2.4")throw Error(o(527,So,"19.2.4"));N.findDOMNode=function(l){var t=l._reactInternals;if(t===void 0)throw typeof l.render=="function"?Error(o(188)):(l=Object.keys(l).join(","),Error(o(268,l)));return l=E(t),l=l!==null?G(l):null,l=l===null?null:l.stateNode,l};var tr={bundleType:0,version:"19.2.4",rendererPackageName:"react-dom",currentDispatcherRef:z,reconcilerVersion:"19.2.4"};if(typeof __REACT_DEVTOOLS_GLOBAL_HOOK__<"u"){var Qn=__REACT_DEVTOOLS_GLOBAL_HOOK__;if(!Qn.isDisabled&&Qn.supportsFiber)try{Ma=Qn.inject(tr),ct=Qn}catch{}}return zu.createRoot=function(l,t){if(!q(l))throw Error(o(299));var e=!1,a="",u=p0,n=_0,c=O0;return t!=null&&(t.unstable_strictMode===!0&&(e=!0),t.identifierPrefix!==void 0&&(a=t.identifierPrefix),t.onUncaughtError!==void 0&&(u=t.onUncaughtError),t.onCaughtError!==void 0&&(n=t.onCaughtError),t.onRecoverableError!==void 0&&(c=t.onRecoverableError)),t=no(l,1,!1,null,null,e,a,null,u,n,c,go),l[Ze]=t.current,Qi(l),new nf(t)},zu.hydrateRoot=function(l,t,e){if(!q(l))throw Error(o(299));var a=!1,u="",n=p0,c=_0,i=O0,f=null;return e!=null&&(e.unstable_strictMode===!0&&(a=!0),e.identifierPrefix!==void 0&&(u=e.identifierPrefix),e.onUncaughtError!==void 0&&(n=e.onUncaughtError),e.onCaughtError!==void 0&&(c=e.onCaughtError),e.onRecoverableError!==void 0&&(i=e.onRecoverableError),e.formState!==void 0&&(f=e.formState)),t=no(l,1,!0,t,e??null,a,u,f,n,c,i,go),t.context=co(null),e=t.current,a=rt(),a=$n(a),u=ce(a),u.callback=null,ie(e,u,a),e=a,t.current.lanes=e,Na(t,e),Ct(t),l[Ze]=t.current,Qi(l),new Gn(t)},zu.version="19.2.4",zu}var xo;function or(){if(xo)return sf.exports;xo=1;function g(){if(!(typeof __REACT_DEVTOOLS_GLOBAL_HOOK__>"u"||typeof __REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE!="function"))try{__REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE(g)}catch(M){console.error(M)}}return g(),sf.exports=dr(),sf.exports}var mr=or();const 
yr=[{id:"all",label:"All"},{id:"lambda",label:"Lambda"},{id:"stateful",label:"Stateful"}],rr=[{id:"all",label:"All"},{id:"og18",label:"OG-18"},{id:"advanced_reasoning",label:"Advanced Reasoning"}],vf=["backend","mode","family","quant"],hr=[{id:"reforged",label:"Reforged"},{id:"bare-vs-reforged",label:"Reforged vs Bare"},{id:"ablation",label:"Full Ablation"}],No=["reforged","bare","no_rescue","no_nudge","no_steps","no_recovery","no_compact"],rf=[{id:"all",label:"All",groupBy:[]},{id:"by-backend",label:"By Backend",groupBy:["model","quant","ablation"],intraSort:"backend"},{id:"by-family",label:"By Family",groupBy:["family"]}];async function vr(){const g=window;if(!g.__FORGE_DATA__)throw new Error("window.__FORGE_DATA__ not injected — build via `python -m tests.eval.report <jsonl> --html <out>`");return g.__FORGE_DATA__}function Do(g,M){if(M==="reforged")return g.filter(o=>o.ablation==="reforged");if(M==="bare-vs-reforged")return g.filter(o=>o.ablation==="reforged"||o.ablation==="bare");const C=new Set;for(const o of g)o.ablation.startsWith("no_")&&C.add(`${o.model}\0${o.backend}\0${o.mode}`);return g.filter(o=>C.has(`${o.model}\0${o.backend}\0${o.mode}`))}function Uo(g){const M=No.indexOf(g);return M===-1?No.length:M}function Eu(g){return g==null?"":g>=95?"text-emerald-400":g>=90?"text-emerald-500/80":g>=70?"text-amber-400":g>=50?"text-orange-400":"text-red-400"}function Xn(g,M=0){return g==null?"—":`${g.toFixed(M)}%`}const gr={0:"⁰",1:"¹",2:"²",3:"³",4:"⁴",5:"⁵",6:"⁶",7:"⁷",8:"⁸",9:"⁹"};function Sr(g){return String(g).split("").map(M=>gr[M]??M).join("")}function br(g,M,C,o,q){const Y=p=>p.endsWith("_stateful");let H=M;return C==="lambda"?H=H.filter(p=>!Y(p)):C==="stateful"&&(H=H.filter(Y)),o==="og18"?H=H.filter(p=>q[p]==="og18"):o==="advanced_reasoning"&&(H=H.filter(p=>q[p]==="advanced_reasoning")),H.length===0?{rows:g,scenarios:M}:H.length===M.length?{rows:g,scenarios:M}:{rows:g.map(p=>{let E=0,G=0,U=0,x=0,tl=0,I=0,vl=0,al=0,pl=0,$=0;for(const fl of 
H)E+=p.scenarioRuns?.[fl]??0,G+=p.scenarioCorrect?.[fl]??0,U+=p.scenarioCompleted?.[fl]??0,x+=p.scenarioValidated?.[fl]??0,tl+=p.scenarioIdealCalls?.[fl]??0,I+=p.scenarioActualCalls?.[fl]??0,vl+=p.scenarioWastedSum?.[fl]??0,al+=p.scenarioWastedN?.[fl]??0,pl+=p.scenarioSpeedSum?.[fl]??0,$+=p.scenarioSpeedN?.[fl]??0;const ul=fl=>Math.round(fl*10)/10,nl=E>0?ul(G/E*100):0,Gl=x>0?ul(G/x*100):null,El=E>0?ul(U/E*100):0,W=I>0?ul(Math.min(tl/I,1)*100):0,Hl=al>0?ul(vl/al):0,Kl=$>0?ul(pl/$):0,$l=p.scenarioCompleted!==void 0,kl=Math.max(0,...H.map(fl=>p.scenarioRuns?.[fl]??0));return{...p,score:nl,accuracy:$l?Gl:p.accuracy,completeness:$l?El:p.completeness,efficiency:$l?W:p.efficiency,wasted:$l?Hl:p.wasted,speed:$l?Kl:p.speed,n:kl}}),scenarios:H}}function zr(g,M,C){const o=[...g];return o.sort((q,Y)=>{let H,R;return M.col==="label"?(H=q.label,R=Y.label):C.includes(M.col)?(H=q.scenarios[M.col]??-1,R=Y.scenarios[M.col]??-1):(H=q[M.col]??-1,R=Y[M.col]??-1),typeof H=="string"&&typeof R=="string"?M.asc?H.localeCompare(R):R.localeCompare(H):M.asc?H-R:R-H}),o}function Er(g,M){return M.map(C=>String(g[C])).join("\0")}function Tr(g,M,C,o,q){const Y=q==="reforged"?M:{id:M.id,label:M.label,groupBy:["model","backend","mode"]},H=q!=="reforged";if(Y.groupBy.length===0)return{sorted:zr(g,C,o),groups:[]};const R=new Map;for(const G of g){const U=Er(G,Y.groupBy);R.has(U)||R.set(U,[]),R.get(U).push(G)}const p=[];for(const[G,U]of R){U.sort((tl,I)=>{if(H){const al=Uo(tl.ablation)-Uo(I.ablation);return al!==0?al:I.score-tl.score}const vl=I.score-tl.score;return vl!==0?vl:Y.intraSort?String(tl[Y.intraSort]).localeCompare(String(I[Y.intraSort])):0});const x=Y.groupBy.map(tl=>U[0][tl]).join(" / ");p.push({key:G,label:x,rows:U})}return p.sort((G,U)=>{const x=Math.max(...G.rows.map(I=>I.score));return Math.max(...U.rows.map(I=>I.score))-x}),{sorted:p.flatMap(G=>G.rows),groups:p}}function Ar({active:g,onChange:M}){return 
_.jsxs("fieldset",{className:"mb-4",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1 mb-1",children:"Screen"}),_.jsx("div",{className:"flex flex-col rounded border border-zinc-700 overflow-hidden",children:hr.map((C,o)=>_.jsx("button",{onClick:()=>M(C.id),className:`text-xs px-2 py-1.5 text-left transition-colors ${o>0?"border-t border-zinc-700":""} ${g===C.id?"bg-emerald-500/20 text-emerald-300 font-medium":"bg-zinc-900/40 text-zinc-400 hover:bg-zinc-900/70 hover:text-zinc-200"}`,children:C.label},C.id))})]})}function pr({active:g,onChange:M}){return _.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:"View"}),_.jsx("div",{className:"flex flex-wrap gap-1",children:rf.map(C=>_.jsx("button",{onClick:()=>M(C.id),className:`text-[0.65rem] px-2 py-0.5 rounded-full border transition-colors ${g===C.id?"border-emerald-500 bg-emerald-500/15 text-emerald-400":"border-zinc-700 text-zinc-500 hover:border-zinc-500 hover:text-zinc-300"}`,children:C.label},C.id))})]})}const _r={backend:"Backend",mode:"Mode",family:"Family",quant:"Quant"};function Or({rows:g,filters:M,onFilterChange:C,activeScreen:o,onScreenChange:q,activeView:Y,onViewChange:H,scenarioScope:R,onScopeChange:p,suiteScope:E,onSuiteChange:G,showRetired:U,onShowRetiredChange:x,hasRetired:tl,filteredCount:I,totalCount:vl,totalRuns:al,timestamp:pl}){return _.jsxs("nav",{className:"w-52 min-w-52 shrink-0 border-r border-zinc-800 p-4 sticky top-0 h-screen overflow-y-auto bg-zinc-950/80",children:[_.jsx("h1",{className:"text-lg font-semibold mb-0.5",children:"Forge Eval"}),_.jsxs("p",{className:"text-xs text-zinc-500 mb-3",children:[I,"/",vl," configs ·"," ",al.toLocaleString()," runs"]}),_.jsx(Ar,{active:o,onChange:q}),_.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded 
p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:"Suite"}),_.jsx("div",{className:"flex flex-wrap gap-1",children:rr.map($=>_.jsx("button",{onClick:()=>G($.id),className:`text-[0.65rem] px-2 py-0.5 rounded-full border transition-colors ${E===$.id?"border-emerald-500 bg-emerald-500/15 text-emerald-400":"border-zinc-700 text-zinc-500 hover:border-zinc-500 hover:text-zinc-300"}`,children:$.label},$.id))})]}),_.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:"Scenarios"}),_.jsx("div",{className:"flex flex-wrap gap-1",children:yr.map($=>_.jsx("button",{onClick:()=>p($.id),className:`text-[0.65rem] px-2 py-0.5 rounded-full border transition-colors ${R===$.id?"border-emerald-500 bg-emerald-500/15 text-emerald-400":"border-zinc-700 text-zinc-500 hover:border-zinc-500 hover:text-zinc-300"}`,children:$.label},$.id))})]}),o==="reforged"&&_.jsx(pr,{active:Y,onChange:H}),vf.map($=>{const ul=[...new Set(g.map(nl=>nl[$]))].sort();return ul.length<2?null:_.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:_r[$]}),ul.map(nl=>_.jsxs("label",{className:"flex items-center gap-1.5 text-xs py-0.5 cursor-pointer hover:text-zinc-200",children:[_.jsx("input",{type:"checkbox",checked:M[$]?.has(nl)??!0,onChange:Gl=>C($,nl,Gl.target.checked),className:"w-3.5 h-3.5 rounded border-zinc-600 bg-zinc-800 accent-emerald-500"}),_.jsx("span",{children:nl})]},nl))]},$)}),tl&&_.jsxs("label",{className:"flex items-center gap-1.5 text-xs py-0.5 mt-2 cursor-pointer text-zinc-500 hover:text-zinc-300",children:[_.jsx("input",{type:"checkbox",checked:U,onChange:$=>x($.target.checked),className:"w-3.5 h-3.5 rounded border-zinc-600 bg-zinc-800 
accent-emerald-500"}),_.jsx("span",{children:"Show retired models"})]}),_.jsx("p",{className:"text-[0.6rem] text-zinc-600 mt-4",children:pl})]})}const Co=[{key:"score",label:"Scr%"},{key:"accuracy",label:"Acc%"},{key:"completeness",label:"Cmp%"},{key:"efficiency",label:"Eff%"},{key:"wasted",label:"Wst"},{key:"speed",label:"Spd"},{key:"n",label:"N"}];function yf({col:g,sort:M}){return M.col!==g?null:_.jsx("span",{className:"ml-0.5 text-emerald-400",children:M.asc?"▲":"▼"})}function Mr({rows:g,scenarios:M,scenarioAbbrev:C,sort:o,onSort:q,checked:Y,onCompareToggle:H,groups:R,maxGen:p,genInfo:E}){const G=new Map;if(R.length>0){let x=0;for(const tl of R)G.set(x,tl.label),x+=tl.rows.length}const U=2+Co.length+M.length;return _.jsx("div",{className:"w-full overflow-x-auto",children:_.jsxs("table",{className:"text-xs whitespace-nowrap border-collapse",children:[_.jsx("thead",{children:_.jsxs("tr",{className:"border-b border-zinc-800",children:[_.jsx("th",{className:"p-1.5 w-8"}),_.jsxs("th",{className:"p-1.5 text-left cursor-pointer select-none hover:text-emerald-400 sticky left-0 bg-zinc-950 z-10",onClick:()=>q("label"),children:["Model/Backend",_.jsx(yf,{col:"label",sort:o})]}),Co.map(x=>_.jsxs("th",{className:"p-1.5 text-right cursor-pointer select-none hover:text-emerald-400",onClick:()=>q(x.key),children:[x.label,_.jsx(yf,{col:x.key,sort:o})]},x.key)),M.map(x=>_.jsxs("th",{className:"p-1.5 text-right cursor-pointer select-none hover:text-emerald-400",onClick:()=>q(x),title:x,children:[C[x]||x.slice(0,3),_.jsx(yf,{col:x,sort:o})]},x))]})}),_.jsx("tbody",{children:g.map((x,tl)=>{const I=Y.includes(tl),vl=G.get(tl);return _.jsxs(ml.Fragment,{children:[vl!=null&&_.jsx("tr",{className:"bg-zinc-900/30",children:_.jsx("td",{colSpan:U,className:"px-2 py-1 text-[0.6rem] font-semibold text-zinc-400 uppercase tracking-wider border-t border-zinc-700/50",children:vl})}),_.jsxs("tr",{className:`border-b border-zinc-900 hover:bg-zinc-900/50 transition-colors ${I?"bg-zinc-800/40":""} 
${x.retired?"opacity-60":""}`,children:[_.jsx("td",{className:"p-1.5 text-center",children:_.jsx("input",{type:"checkbox",checked:I,onChange:al=>H(tl,al.target.checked),className:"w-3.5 h-3.5 rounded border-zinc-600 bg-zinc-800 accent-emerald-500 cursor-pointer"})}),_.jsxs("td",{className:"p-1.5 font-mono sticky left-0 bg-zinc-950 z-10",children:[x.label,p>0&&x.gen<p&&_.jsx("sup",{className:"ml-0.5 text-zinc-500",title:(()=>{const al=E?.[String(x.gen)];return al?`gen ${x.gen}: ${al.note} (commit ${al.commit}, ${al.date})`:`gen ${x.gen}`})(),children:Sr(x.gen)}),x.retired&&_.jsx("span",{className:"ml-1.5 align-middle text-[0.55rem] uppercase tracking-wider text-zinc-500 border border-zinc-700 rounded px-1",children:"retired"})]}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.score)}`,children:Xn(x.score,1)}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.accuracy)}`,children:Xn(x.accuracy,1)}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.completeness)}`,children:Xn(x.completeness,1)}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.efficiency)}`,children:Xn(x.efficiency)}),_.jsx("td",{className:"p-1.5 text-right tabular-nums text-zinc-400",children:x.wasted.toFixed(1)}),_.jsxs("td",{className:"p-1.5 text-right tabular-nums text-zinc-400",children:[x.speed.toFixed(1),"s"]}),_.jsx("td",{className:"p-1.5 text-right tabular-nums text-zinc-500",children:x.n}),M.map(al=>{const pl=x.scenarios[al],$=x.scenarioRuns?.[al]??0;let ul,nl;return pl!=null?(ul=String(pl),nl=Eu(pl)):$===0?(ul="I",nl="text-zinc-700"):(ul="—",nl="text-zinc-600"),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${nl}`,children:ul},al)})]})]},x.label)})})]})})}const 
xr=[{key:"score",label:"Score",fmt:g=>g==null?"—":`${g.toFixed(1)}%`,higherBetter:!0},{key:"accuracy",label:"Accuracy",fmt:g=>g==null?"—":`${g.toFixed(1)}%`,higherBetter:!0},{key:"completeness",label:"Completeness",fmt:g=>g==null?"—":`${g.toFixed(1)}%`,higherBetter:!0},{key:"efficiency",label:"Efficiency",fmt:g=>g==null?"—":`${g.toFixed(1)}%`,higherBetter:!0},{key:"wasted",label:"Avg Wasted",fmt:g=>g==null?"—":g.toFixed(1),higherBetter:!1},{key:"speed",label:"Speed",fmt:g=>g==null?"—":`${g.toFixed(1)}s`,higherBetter:!1}];function Ro({va:g,vb:M,higherBetter:C}){if(g==null||M==null)return _.jsx("td",{className:"p-1.5 text-right text-zinc-600",children:"—"});const o=M-g,Y=(o>0?"+":"")+(Number.isInteger(o)?o:o.toFixed(1));let H="text-zinc-500";return o!==0&&(H=o>0===C?"text-emerald-400":"text-red-400"),_.jsx("td",{className:`p-1.5 text-right tabular-nums font-medium ${H}`,children:Y})}function Nr({a:g,b:M,scenarios:C,scenarioAbbrev:o,onSwap:q,onClear:Y}){const H=(R,p)=>p in R.scenarios?R.scenarios[p]:R[p]??null;return _.jsxs("div",{className:"mt-6 border border-zinc-800 rounded-lg p-4 max-w-2xl",children:[_.jsxs("div",{className:"flex items-center justify-between mb-3",children:[_.jsx("h3",{className:"text-sm font-semibold",children:"Compare"}),_.jsxs("div",{className:"flex gap-2",children:[_.jsx("button",{onClick:q,className:"text-xs px-2.5 py-1 rounded border border-zinc-700 hover:border-zinc-500 transition-colors",children:"Swap A↔B"}),_.jsx("button",{onClick:Y,className:"text-xs px-2.5 py-1 rounded border border-zinc-700 hover:border-red-500/50 hover:text-red-400 transition-colors",children:"Clear"})]})]}),_.jsxs("table",{className:"text-xs w-full border-collapse",children:[_.jsx("thead",{children:_.jsxs("tr",{className:"border-b border-zinc-800",children:[_.jsx("th",{className:"p-1.5 text-left text-zinc-500",children:"Metric"}),_.jsx("th",{className:"p-1.5 text-right text-zinc-400 max-w-48 truncate",title:g.label,children:g.label}),_.jsx("th",{className:"p-1.5 
text-right text-zinc-500 w-16",children:"Delta"}),_.jsx("th",{className:"p-1.5 text-right text-zinc-400 max-w-48 truncate",title:M.label,children:M.label})]})}),_.jsxs("tbody",{children:[xr.map(R=>{const p=H(g,R.key),E=H(M,R.key);return _.jsxs("tr",{className:"border-b border-zinc-900/50",children:[_.jsx("td",{className:"p-1.5 text-zinc-400",children:R.label}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:R.fmt(p)}),_.jsx(Ro,{va:p,vb:E,higherBetter:R.higherBetter}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:R.fmt(E)})]},R.key)}),_.jsx("tr",{children:_.jsx("td",{colSpan:4,className:"py-1",children:_.jsx("div",{className:"border-t border-zinc-800"})})}),C.map(R=>{const p=g.scenarios[R],E=M.scenarios[R],G=(U,x)=>U!=null?`${U}%`:(x.scenarioRuns?.[R]??0)===0?"I":"—";return _.jsxs("tr",{className:"border-b border-zinc-900/50",children:[_.jsx("td",{className:"p-1.5 text-zinc-500",children:o[R]||R}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:G(p,g)}),_.jsx(Ro,{va:p,vb:E,higherBetter:!0}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:G(E,M)})]},R)})]})]})]})}function Dr(g){const M={};for(const C of vf)M[C]=new Set(g.map(o=>o[C]));return M}function Ur(){const[g,M]=ml.useState(null),[C,o]=ml.useState(null),[q,Y]=ml.useState({col:"score",asc:!1}),[H,R]=ml.useState([]),[p,E]=ml.useState("reforged"),[G,U]=ml.useState("all"),[x,tl]=ml.useState("all"),[I,vl]=ml.useState("all"),[al,pl]=ml.useState(!1);ml.useEffect(()=>{vr().then(v=>{M(v),o(Dr(v.rows))})},[]);const $=ml.useMemo(()=>g?al?g.rows:g.rows.filter(v=>!v.retired):[],[g,al]),ul=ml.useMemo(()=>g?g.rows.some(v=>v.retired):!1,[g]),nl=ml.useMemo(()=>!g||!C?[]:Do($,p).filter(O=>vf.every(D=>!C[D]||C[D].has(O[D]))),[g,C,p,$]),{rows:Gl,scenarios:El}=ml.useMemo(()=>br(nl,g?.scenarios??[],x,I,g?.scenarioSuite??{}),[nl,g,x,I]),W=ml.useMemo(()=>{if(!g)return{};const v=g.scenarioAbbrev,O=new Set(El),D={};for(const[X,K]of Object.entries(v))O.has(X)&&(D[X]=K);return 
D},[g,El]),Hl=ml.useMemo(()=>rf.find(v=>v.id===G)??rf[0],[G]),{sorted:Kl,groups:$l}=ml.useMemo(()=>Tr(Gl,Hl,q,El,p),[Gl,Hl,q,El,p]),kl=ml.useMemo(()=>nl.reduce((v,O)=>v+O.n*El.length,0),[nl,El]),fl=ml.useCallback((v,O,D)=>{o(X=>{if(!X)return X;const K={...X,[v]:new Set(X[v])};return D?K[v].add(O):K[v].delete(O),K}),R([])},[]),Rt=ml.useCallback(v=>{E(v),R([])},[]),_t=ml.useCallback(v=>{U(v),R([])},[]),ut=ml.useCallback(v=>{tl(v),R([])},[]),z=ml.useCallback(v=>{vl(v),R([])},[]),N=ml.useCallback(v=>{pl(v),R([])},[]),V=ml.useCallback(v=>{Y(O=>O.col===v?{col:v,asc:!O.asc}:{col:v,asc:v==="label"})},[]),dl=ml.useCallback((v,O)=>{R(D=>O?D.length>=2?[D[1],v]:[...D,v]:D.filter(X=>X!==v))},[]),yl=ml.useCallback(()=>{R(v=>[...v].reverse())},[]),d=ml.useCallback(()=>{R([])},[]);return!g||!C?_.jsx("div",{className:"flex items-center justify-center min-h-screen text-zinc-500",children:"Loading..."}):_.jsxs("div",{className:"flex min-h-screen",children:[_.jsx(Or,{rows:$,filters:C,onFilterChange:fl,activeScreen:p,onScreenChange:Rt,activeView:G,onViewChange:_t,scenarioScope:x,onScopeChange:ut,suiteScope:I,onSuiteChange:z,showRetired:al,onShowRetiredChange:N,hasRetired:ul,filteredCount:nl.length,totalCount:Do($,p).length,totalRuns:kl,timestamp:g.timestamp}),_.jsxs("main",{className:"flex-1 min-w-0 p-4 flex flex-col",children:[_.jsx(Mr,{rows:Kl,scenarios:El,scenarioAbbrev:W,sort:q,onSort:V,checked:H,onCompareToggle:dl,groups:$l,maxGen:g.maxGen??0,genInfo:g.genInfo}),H.length===2&&_.jsx(Nr,{a:Kl[H[0]],b:Kl[H[1]],scenarios:El,scenarioAbbrev:W,onSwap:yl,onClear:d}),_.jsxs("p",{className:"text-[0.6rem] text-zinc-600 mt-6",children:["Generated ",g.timestamp]})]})]})}mr.createRoot(document.getElementById("root")).render(_.jsx(ml.StrictMode,{children:_.jsx(Ur,{})}));</script>
+`+a.stack}}var Jn=Object.prototype.hasOwnProperty,wn=h.unstable_scheduleCallback,Wn=h.unstable_cancelCallback,qo=h.unstable_shouldYield,Yo=h.unstable_requestPaint,nt=h.unstable_now,Go=h.unstable_getCurrentPriorityLevel,zf=h.unstable_ImmediatePriority,Ef=h.unstable_UserBlockingPriority,Au=h.unstable_NormalPriority,Qo=h.unstable_LowPriority,Tf=h.unstable_IdlePriority,Xo=h.log,Zo=h.unstable_setDisableYieldValue,Ma=null,ct=null;function It(l){if(typeof Xo=="function"&&Zo(l),ct&&typeof ct.setStrictMode=="function")try{ct.setStrictMode(Ma,l)}catch{}}var it=Math.clz32?Math.clz32:Ko,Vo=Math.log,Lo=Math.LN2;function Ko(l){return l>>>=0,l===0?32:31-(Vo(l)/Lo|0)|0}var pu=256,_u=262144,Ou=4194304;function pe(l){var t=l&42;if(t!==0)return t;switch(l&-l){case 1:return 1;case 2:return 2;case 4:return 4;case 8:return 8;case 16:return 16;case 32:return 32;case 64:return 64;case 128:return 128;case 256:case 512:case 1024:case 2048:case 4096:case 8192:case 16384:case 32768:case 65536:case 131072:return l&261888;case 262144:case 524288:case 1048576:case 2097152:return l&3932160;case 4194304:case 8388608:case 16777216:case 33554432:return l&62914560;case 67108864:return 67108864;case 134217728:return 134217728;case 268435456:return 268435456;case 536870912:return 536870912;case 1073741824:return 0;default:return l}}function Mu(l,t,e){var a=l.pendingLanes;if(a===0)return 0;var u=0,n=l.suspendedLanes,c=l.pingedLanes;l=l.warmLanes;var i=a&134217727;return i!==0?(a=i&~n,a!==0?u=pe(a):(c&=i,c!==0?u=pe(c):e||(e=i&~l,e!==0&&(u=pe(e))))):(i=a&~n,i!==0?u=pe(i):c!==0?u=pe(c):e||(e=a&~l,e!==0&&(u=pe(e)))),u===0?0:t!==0&&t!==u&&(t&n)===0&&(n=u&-u,e=t&-t,n>=e||n===32&&(e&4194048)!==0)?t:u}function xa(l,t){return(l.pendingLanes&~(l.suspendedLanes&~l.pingedLanes)&t)===0}function Jo(l,t){switch(l){case 1:case 2:case 4:case 8:case 64:return t+250;case 16:case 32:case 128:case 256:case 512:case 1024:case 2048:case 4096:case 8192:case 16384:case 32768:case 65536:case 131072:case 262144:case 524288:case 
1048576:case 2097152:return t+5e3;case 4194304:case 8388608:case 16777216:case 33554432:return-1;case 67108864:case 134217728:case 268435456:case 536870912:case 1073741824:return-1;default:return-1}}function Af(){var l=Ou;return Ou<<=1,(Ou&62914560)===0&&(Ou=4194304),l}function $n(l){for(var t=[],e=0;31>e;e++)t.push(l);return t}function Da(l,t){l.pendingLanes|=t,t!==268435456&&(l.suspendedLanes=0,l.pingedLanes=0,l.warmLanes=0)}function wo(l,t,e,a,u,n){var c=l.pendingLanes;l.pendingLanes=e,l.suspendedLanes=0,l.pingedLanes=0,l.warmLanes=0,l.expiredLanes&=e,l.entangledLanes&=e,l.errorRecoveryDisabledLanes&=e,l.shellSuspendCounter=0;var i=l.entanglements,f=l.expirationTimes,r=l.hiddenUpdates;for(e=c&~e;0<e;){var b=31-it(e),A=1<<b;i[b]=0,f[b]=-1;var v=r[b];if(v!==null)for(r[b]=null,b=0;b<v.length;b++){var S=v[b];S!==null&&(S.lane&=-536870913)}e&=~A}a!==0&&pf(l,a,0),n!==0&&u===0&&l.tag!==0&&(l.suspendedLanes|=n&~(c&~t))}function pf(l,t,e){l.pendingLanes|=t,l.suspendedLanes&=~t;var a=31-it(t);l.entangledLanes|=t,l.entanglements[a]=l.entanglements[a]|1073741824|e&261930}function _f(l,t){var e=l.entangledLanes|=t;for(l=l.entanglements;e;){var a=31-it(e),u=1<<a;u&t|l[a]&t&&(l[a]|=t),e&=~u}}function Of(l,t){var e=t&-t;return e=(e&42)!==0?1:kn(e),(e&(l.suspendedLanes|t))!==0?0:e}function kn(l){switch(l){case 2:l=1;break;case 8:l=4;break;case 32:l=16;break;case 256:case 512:case 1024:case 2048:case 4096:case 8192:case 16384:case 32768:case 65536:case 131072:case 262144:case 524288:case 1048576:case 2097152:case 4194304:case 8388608:case 16777216:case 33554432:l=128;break;case 268435456:l=134217728;break;default:l=0}return l}function Fn(l){return l&=-l,2<l?8<l?(l&134217727)!==0?32:268435456:8:2}function Mf(){var l=D.p;return l!==0?l:(l=window.event,l===void 0?32:yo(l.type))}function xf(l,t){var e=D.p;try{return D.p=l,t()}finally{D.p=e}}var 
Pt=Math.random().toString(36).slice(2),Ql="__reactFiber$"+Pt,Fl="__reactProps$"+Pt,Ze="__reactContainer$"+Pt,In="__reactEvents$"+Pt,Wo="__reactListeners$"+Pt,$o="__reactHandles$"+Pt,Df="__reactResources$"+Pt,Na="__reactMarker$"+Pt;function Pn(l){delete l[Ql],delete l[Fl],delete l[In],delete l[Wo],delete l[$o]}function Ve(l){var t=l[Ql];if(t)return t;for(var e=l.parentNode;e;){if(t=e[Ze]||e[Ql]){if(e=t.alternate,t.child!==null||e!==null&&e.child!==null)for(l=kd(l);l!==null;){if(e=l[Ql])return e;l=kd(l)}return t}l=e,e=l.parentNode}return null}function Le(l){if(l=l[Ql]||l[Ze]){var t=l.tag;if(t===5||t===6||t===13||t===31||t===26||t===27||t===3)return l}return null}function Ua(l){var t=l.tag;if(t===5||t===26||t===27||t===6)return l.stateNode;throw Error(o(33))}function Ke(l){var t=l[Df];return t||(t=l[Df]={hoistableStyles:new Map,hoistableScripts:new Map}),t}function Yl(l){l[Na]=!0}var Nf=new Set,Uf={};function _e(l,t){Je(l,t),Je(l+"Capture",t)}function Je(l,t){for(Uf[l]=t,l=0;l<t.length;l++)Nf.add(t[l])}var ko=RegExp("^[:A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD][:A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD\\-.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040]*$"),Cf={},Rf={};function Fo(l){return Jn.call(Rf,l)?!0:Jn.call(Cf,l)?!1:ko.test(l)?Rf[l]=!0:(Cf[l]=!0,!1)}function xu(l,t,e){if(Fo(t))if(e===null)l.removeAttribute(t);else{switch(typeof e){case"undefined":case"function":case"symbol":l.removeAttribute(t);return;case"boolean":var a=t.toLowerCase().slice(0,5);if(a!=="data-"&&a!=="aria-"){l.removeAttribute(t);return}}l.setAttribute(t,""+e)}}function Du(l,t,e){if(e===null)l.removeAttribute(t);else{switch(typeof 
e){case"undefined":case"function":case"symbol":case"boolean":l.removeAttribute(t);return}l.setAttribute(t,""+e)}}function jt(l,t,e,a){if(a===null)l.removeAttribute(e);else{switch(typeof a){case"undefined":case"function":case"symbol":case"boolean":l.removeAttribute(e);return}l.setAttributeNS(t,e,""+a)}}function ht(l){switch(typeof l){case"bigint":case"boolean":case"number":case"string":case"undefined":return l;case"object":return l;default:return""}}function jf(l){var t=l.type;return(l=l.nodeName)&&l.toLowerCase()==="input"&&(t==="checkbox"||t==="radio")}function Io(l,t,e){var a=Object.getOwnPropertyDescriptor(l.constructor.prototype,t);if(!l.hasOwnProperty(t)&&typeof a<"u"&&typeof a.get=="function"&&typeof a.set=="function"){var u=a.get,n=a.set;return Object.defineProperty(l,t,{configurable:!0,get:function(){return u.call(this)},set:function(c){e=""+c,n.call(this,c)}}),Object.defineProperty(l,t,{enumerable:a.enumerable}),{getValue:function(){return e},setValue:function(c){e=""+c},stopTracking:function(){l._valueTracker=null,delete l[t]}}}}function lc(l){if(!l._valueTracker){var t=jf(l)?"checked":"value";l._valueTracker=Io(l,t,""+l[t])}}function Hf(l){if(!l)return!1;var t=l._valueTracker;if(!t)return!0;var e=t.getValue(),a="";return l&&(a=jf(l)?l.checked?"true":"false":l.value),l=a,l!==e?(t.setValue(l),!0):!1}function Nu(l){if(l=l||(typeof document<"u"?document:void 0),typeof l>"u")return null;try{return l.activeElement||l.body}catch{return l.body}}var Po=/[\n"\\]/g;function vt(l){return l.replace(Po,function(t){return"\\"+t.charCodeAt(0).toString(16)+" "})}function tc(l,t,e,a,u,n,c,i){l.name="",c!=null&&typeof c!="function"&&typeof c!="symbol"&&typeof 
c!="boolean"?l.type=c:l.removeAttribute("type"),t!=null?c==="number"?(t===0&&l.value===""||l.value!=t)&&(l.value=""+ht(t)):l.value!==""+ht(t)&&(l.value=""+ht(t)):c!=="submit"&&c!=="reset"||l.removeAttribute("value"),t!=null?ec(l,c,ht(t)):e!=null?ec(l,c,ht(e)):a!=null&&l.removeAttribute("value"),u==null&&n!=null&&(l.defaultChecked=!!n),u!=null&&(l.checked=u&&typeof u!="function"&&typeof u!="symbol"),i!=null&&typeof i!="function"&&typeof i!="symbol"&&typeof i!="boolean"?l.name=""+ht(i):l.removeAttribute("name")}function Bf(l,t,e,a,u,n,c,i){if(n!=null&&typeof n!="function"&&typeof n!="symbol"&&typeof n!="boolean"&&(l.type=n),t!=null||e!=null){if(!(n!=="submit"&&n!=="reset"||t!=null)){lc(l);return}e=e!=null?""+ht(e):"",t=t!=null?""+ht(t):e,i||t===l.value||(l.value=t),l.defaultValue=t}a=a??u,a=typeof a!="function"&&typeof a!="symbol"&&!!a,l.checked=i?l.checked:!!a,l.defaultChecked=!!a,c!=null&&typeof c!="function"&&typeof c!="symbol"&&typeof c!="boolean"&&(l.name=c),lc(l)}function ec(l,t,e){t==="number"&&Nu(l.ownerDocument)===l||l.defaultValue===""+e||(l.defaultValue=""+e)}function we(l,t,e,a){if(l=l.options,t){t={};for(var u=0;u<e.length;u++)t["$"+e[u]]=!0;for(e=0;e<l.length;e++)u=t.hasOwnProperty("$"+l[e].value),l[e].selected!==u&&(l[e].selected=u),u&&a&&(l[e].defaultSelected=!0)}else{for(e=""+ht(e),t=null,u=0;u<l.length;u++){if(l[u].value===e){l[u].selected=!0,a&&(l[u].defaultSelected=!0);return}t!==null||l[u].disabled||(t=l[u])}t!==null&&(t.selected=!0)}}function qf(l,t,e){if(t!=null&&(t=""+ht(t),t!==l.value&&(l.value=t),e==null)){l.defaultValue!==t&&(l.defaultValue=t);return}l.defaultValue=e!=null?""+ht(e):""}function Yf(l,t,e,a){if(t==null){if(a!=null){if(e!=null)throw Error(o(92));if(ut(a)){if(1<a.length)throw Error(o(93));a=a[0]}e=a}e==null&&(e=""),t=e}e=ht(t),l.defaultValue=e,a=l.textContent,a===e&&a!==""&&a!==null&&(l.value=a),lc(l)}function We(l,t){if(t){var e=l.firstChild;if(e&&e===l.lastChild&&e.nodeType===3){e.nodeValue=t;return}}l.textContent=t}var lm=new 
Set("animationIterationCount aspectRatio borderImageOutset borderImageSlice borderImageWidth boxFlex boxFlexGroup boxOrdinalGroup columnCount columns flex flexGrow flexPositive flexShrink flexNegative flexOrder gridArea gridRow gridRowEnd gridRowSpan gridRowStart gridColumn gridColumnEnd gridColumnSpan gridColumnStart fontWeight lineClamp lineHeight opacity order orphans scale tabSize widows zIndex zoom fillOpacity floodOpacity stopOpacity strokeDasharray strokeDashoffset strokeMiterlimit strokeOpacity strokeWidth MozAnimationIterationCount MozBoxFlex MozBoxFlexGroup MozLineClamp msAnimationIterationCount msFlex msZoom msFlexGrow msFlexNegative msFlexOrder msFlexPositive msFlexShrink msGridColumn msGridColumnSpan msGridRow msGridRowSpan WebkitAnimationIterationCount WebkitBoxFlex WebKitBoxFlexGroup WebkitBoxOrdinalGroup WebkitColumnCount WebkitColumns WebkitFlex WebkitFlexGrow WebkitFlexPositive WebkitFlexShrink WebkitLineClamp".split(" "));function Gf(l,t,e){var a=t.indexOf("--")===0;e==null||typeof e=="boolean"||e===""?a?l.setProperty(t,""):t==="float"?l.cssFloat="":l[t]="":a?l.setProperty(t,e):typeof e!="number"||e===0||lm.has(t)?t==="float"?l.cssFloat=e:l[t]=(""+e).trim():l[t]=e+"px"}function Qf(l,t,e){if(t!=null&&typeof t!="object")throw Error(o(62));if(l=l.style,e!=null){for(var a in e)!e.hasOwnProperty(a)||t!=null&&t.hasOwnProperty(a)||(a.indexOf("--")===0?l.setProperty(a,""):a==="float"?l.cssFloat="":l[a]="");for(var u in t)a=t[u],t.hasOwnProperty(u)&&e[u]!==a&&Gf(l,u,a)}else for(var n in t)t.hasOwnProperty(n)&&Gf(l,n,t[n])}function ac(l){if(l.indexOf("-")===-1)return!1;switch(l){case"annotation-xml":case"color-profile":case"font-face":case"font-face-src":case"font-face-uri":case"font-face-format":case"font-face-name":case"missing-glyph":return!1;default:return!0}}var tm=new 
Map([["acceptCharset","accept-charset"],["htmlFor","for"],["httpEquiv","http-equiv"],["crossOrigin","crossorigin"],["accentHeight","accent-height"],["alignmentBaseline","alignment-baseline"],["arabicForm","arabic-form"],["baselineShift","baseline-shift"],["capHeight","cap-height"],["clipPath","clip-path"],["clipRule","clip-rule"],["colorInterpolation","color-interpolation"],["colorInterpolationFilters","color-interpolation-filters"],["colorProfile","color-profile"],["colorRendering","color-rendering"],["dominantBaseline","dominant-baseline"],["enableBackground","enable-background"],["fillOpacity","fill-opacity"],["fillRule","fill-rule"],["floodColor","flood-color"],["floodOpacity","flood-opacity"],["fontFamily","font-family"],["fontSize","font-size"],["fontSizeAdjust","font-size-adjust"],["fontStretch","font-stretch"],["fontStyle","font-style"],["fontVariant","font-variant"],["fontWeight","font-weight"],["glyphName","glyph-name"],["glyphOrientationHorizontal","glyph-orientation-horizontal"],["glyphOrientationVertical","glyph-orientation-vertical"],["horizAdvX","horiz-adv-x"],["horizOriginX","horiz-origin-x"],["imageRendering","image-rendering"],["letterSpacing","letter-spacing"],["lightingColor","lighting-color"],["markerEnd","marker-end"],["markerMid","marker-mid"],["markerStart","marker-start"],["overlinePosition","overline-position"],["overlineThickness","overline-thickness"],["paintOrder","paint-order"],["panose-1","panose-1"],["pointerEvents","pointer-events"],["renderingIntent","rendering-intent"],["shapeRendering","shape-rendering"],["stopColor","stop-color"],["stopOpacity","stop-opacity"],["strikethroughPosition","strikethrough-position"],["strikethroughThickness","strikethrough-thickness"],["strokeDasharray","stroke-dasharray"],["strokeDashoffset","stroke-dashoffset"],["strokeLinecap","stroke-linecap"],["strokeLinejoin","stroke-linejoin"],["strokeMiterlimit","stroke-miterlimit"],["strokeOpacity","stroke-opacity"],["strokeWidth","stroke-width"],["textAnchor"
,"text-anchor"],["textDecoration","text-decoration"],["textRendering","text-rendering"],["transformOrigin","transform-origin"],["underlinePosition","underline-position"],["underlineThickness","underline-thickness"],["unicodeBidi","unicode-bidi"],["unicodeRange","unicode-range"],["unitsPerEm","units-per-em"],["vAlphabetic","v-alphabetic"],["vHanging","v-hanging"],["vIdeographic","v-ideographic"],["vMathematical","v-mathematical"],["vectorEffect","vector-effect"],["vertAdvY","vert-adv-y"],["vertOriginX","vert-origin-x"],["vertOriginY","vert-origin-y"],["wordSpacing","word-spacing"],["writingMode","writing-mode"],["xmlnsXlink","xmlns:xlink"],["xHeight","x-height"]]),em=/^[\u0000-\u001F ]*j[\r\n\t]*a[\r\n\t]*v[\r\n\t]*a[\r\n\t]*s[\r\n\t]*c[\r\n\t]*r[\r\n\t]*i[\r\n\t]*p[\r\n\t]*t[\r\n\t]*:/i;function Uu(l){return em.test(""+l)?"javascript:throw new Error('React has blocked a javascript: URL as a security precaution.')":l}function Ht(){}var uc=null;function nc(l){return l=l.target||l.srcElement||window,l.correspondingUseElement&&(l=l.correspondingUseElement),l.nodeType===3?l.parentNode:l}var $e=null,ke=null;function Xf(l){var t=Le(l);if(t&&(l=t.stateNode)){var e=l[Fl]||null;l:switch(l=t.stateNode,t.type){case"input":if(tc(l,e.value,e.defaultValue,e.defaultValue,e.checked,e.defaultChecked,e.type,e.name),t=e.name,e.type==="radio"&&t!=null){for(e=l;e.parentNode;)e=e.parentNode;for(e=e.querySelectorAll('input[name="'+vt(""+t)+'"][type="radio"]'),t=0;t<e.length;t++){var a=e[t];if(a!==l&&a.form===l.form){var u=a[Fl]||null;if(!u)throw Error(o(90));tc(a,u.value,u.defaultValue,u.defaultValue,u.checked,u.defaultChecked,u.type,u.name)}}for(t=0;t<e.length;t++)a=e[t],a.form===l.form&&Hf(a)}break l;case"textarea":qf(l,e.value,e.defaultValue);break l;case"select":t=e.value,t!=null&&we(l,!!e.multiple,t,!1)}}}var cc=!1;function Zf(l,t,e){if(cc)return l(t,e);cc=!0;try{var a=l(t);return 
a}finally{if(cc=!1,($e!==null||ke!==null)&&(bn(),$e&&(t=$e,l=ke,ke=$e=null,Xf(t),l)))for(t=0;t<l.length;t++)Xf(l[t])}}function Ca(l,t){var e=l.stateNode;if(e===null)return null;var a=e[Fl]||null;if(a===null)return null;e=a[t];l:switch(t){case"onClick":case"onClickCapture":case"onDoubleClick":case"onDoubleClickCapture":case"onMouseDown":case"onMouseDownCapture":case"onMouseMove":case"onMouseMoveCapture":case"onMouseUp":case"onMouseUpCapture":case"onMouseEnter":(a=!a.disabled)||(l=l.type,a=!(l==="button"||l==="input"||l==="select"||l==="textarea")),l=!a;break l;default:l=!1}if(l)return null;if(e&&typeof e!="function")throw Error(o(231,t,typeof e));return e}var Bt=!(typeof window>"u"||typeof window.document>"u"||typeof window.document.createElement>"u"),ic=!1;if(Bt)try{var Ra={};Object.defineProperty(Ra,"passive",{get:function(){ic=!0}}),window.addEventListener("test",Ra,Ra),window.removeEventListener("test",Ra,Ra)}catch{ic=!1}var le=null,fc=null,Cu=null;function Vf(){if(Cu)return Cu;var l,t=fc,e=t.length,a,u="value"in le?le.value:le.textContent,n=u.length;for(l=0;l<e&&t[l]===u[l];l++);var c=e-l;for(a=1;a<=c&&t[e-a]===u[n-a];a++);return Cu=u.slice(l,1<a?1-a:void 0)}function Ru(l){var t=l.keyCode;return"charCode"in l?(l=l.charCode,l===0&&t===13&&(l=13)):l=t,l===10&&(l=13),32<=l||l===13?l:0}function ju(){return!0}function Lf(){return!1}function Il(l){function t(e,a,u,n,c){this._reactName=e,this._targetInst=u,this.type=a,this.nativeEvent=n,this.target=c,this.currentTarget=null;for(var i in l)l.hasOwnProperty(i)&&(e=l[i],this[i]=e?e(n):n[i]);return this.isDefaultPrevented=(n.defaultPrevented!=null?n.defaultPrevented:n.returnValue===!1)?ju:Lf,this.isPropagationStopped=Lf,this}return U(t.prototype,{preventDefault:function(){this.defaultPrevented=!0;var e=this.nativeEvent;e&&(e.preventDefault?e.preventDefault():typeof e.returnValue!="unknown"&&(e.returnValue=!1),this.isDefaultPrevented=ju)},stopPropagation:function(){var 
e=this.nativeEvent;e&&(e.stopPropagation?e.stopPropagation():typeof e.cancelBubble!="unknown"&&(e.cancelBubble=!0),this.isPropagationStopped=ju)},persist:function(){},isPersistent:ju}),t}var Oe={eventPhase:0,bubbles:0,cancelable:0,timeStamp:function(l){return l.timeStamp||Date.now()},defaultPrevented:0,isTrusted:0},Hu=Il(Oe),ja=U({},Oe,{view:0,detail:0}),am=Il(ja),sc,dc,Ha,Bu=U({},ja,{screenX:0,screenY:0,clientX:0,clientY:0,pageX:0,pageY:0,ctrlKey:0,shiftKey:0,altKey:0,metaKey:0,getModifierState:mc,button:0,buttons:0,relatedTarget:function(l){return l.relatedTarget===void 0?l.fromElement===l.srcElement?l.toElement:l.fromElement:l.relatedTarget},movementX:function(l){return"movementX"in l?l.movementX:(l!==Ha&&(Ha&&l.type==="mousemove"?(sc=l.screenX-Ha.screenX,dc=l.screenY-Ha.screenY):dc=sc=0,Ha=l),sc)},movementY:function(l){return"movementY"in l?l.movementY:dc}}),Kf=Il(Bu),um=U({},Bu,{dataTransfer:0}),nm=Il(um),cm=U({},ja,{relatedTarget:0}),oc=Il(cm),im=U({},Oe,{animationName:0,elapsedTime:0,pseudoElement:0}),fm=Il(im),sm=U({},Oe,{clipboardData:function(l){return"clipboardData"in l?l.clipboardData:window.clipboardData}}),dm=Il(sm),om=U({},Oe,{data:0}),Jf=Il(om),mm={Esc:"Escape",Spacebar:" ",Left:"ArrowLeft",Up:"ArrowUp",Right:"ArrowRight",Down:"ArrowDown",Del:"Delete",Win:"OS",Menu:"ContextMenu",Apps:"ContextMenu",Scroll:"ScrollLock",MozPrintableKey:"Unidentified"},ym={8:"Backspace",9:"Tab",12:"Clear",13:"Enter",16:"Shift",17:"Control",18:"Alt",19:"Pause",20:"CapsLock",27:"Escape",32:" ",33:"PageUp",34:"PageDown",35:"End",36:"Home",37:"ArrowLeft",38:"ArrowUp",39:"ArrowRight",40:"ArrowDown",45:"Insert",46:"Delete",112:"F1",113:"F2",114:"F3",115:"F4",116:"F5",117:"F6",118:"F7",119:"F8",120:"F9",121:"F10",122:"F11",123:"F12",144:"NumLock",145:"ScrollLock",224:"Meta"},rm={Alt:"altKey",Control:"ctrlKey",Meta:"metaKey",Shift:"shiftKey"};function hm(l){var t=this.nativeEvent;return t.getModifierState?t.getModifierState(l):(l=rm[l])?!!t[l]:!1}function mc(){return hm}var 
vm=U({},ja,{key:function(l){if(l.key){var t=mm[l.key]||l.key;if(t!=="Unidentified")return t}return l.type==="keypress"?(l=Ru(l),l===13?"Enter":String.fromCharCode(l)):l.type==="keydown"||l.type==="keyup"?ym[l.keyCode]||"Unidentified":""},code:0,location:0,ctrlKey:0,shiftKey:0,altKey:0,metaKey:0,repeat:0,locale:0,getModifierState:mc,charCode:function(l){return l.type==="keypress"?Ru(l):0},keyCode:function(l){return l.type==="keydown"||l.type==="keyup"?l.keyCode:0},which:function(l){return l.type==="keypress"?Ru(l):l.type==="keydown"||l.type==="keyup"?l.keyCode:0}}),gm=Il(vm),Sm=U({},Bu,{pointerId:0,width:0,height:0,pressure:0,tangentialPressure:0,tiltX:0,tiltY:0,twist:0,pointerType:0,isPrimary:0}),wf=Il(Sm),bm=U({},ja,{touches:0,targetTouches:0,changedTouches:0,altKey:0,metaKey:0,ctrlKey:0,shiftKey:0,getModifierState:mc}),zm=Il(bm),Em=U({},Oe,{propertyName:0,elapsedTime:0,pseudoElement:0}),Tm=Il(Em),Am=U({},Bu,{deltaX:function(l){return"deltaX"in l?l.deltaX:"wheelDeltaX"in l?-l.wheelDeltaX:0},deltaY:function(l){return"deltaY"in l?l.deltaY:"wheelDeltaY"in l?-l.wheelDeltaY:"wheelDelta"in l?-l.wheelDelta:0},deltaZ:0,deltaMode:0}),pm=Il(Am),_m=U({},Oe,{newState:0,oldState:0}),Om=Il(_m),Mm=[9,13,27,32],yc=Bt&&"CompositionEvent"in window,Ba=null;Bt&&"documentMode"in document&&(Ba=document.documentMode);var xm=Bt&&"TextEvent"in window&&!Ba,Wf=Bt&&(!yc||Ba&&8<Ba&&11>=Ba),$f=" ",kf=!1;function Ff(l,t){switch(l){case"keyup":return Mm.indexOf(t.keyCode)!==-1;case"keydown":return t.keyCode!==229;case"keypress":case"mousedown":case"focusout":return!0;default:return!1}}function If(l){return l=l.detail,typeof l=="object"&&"data"in l?l.data:null}var Fe=!1;function Dm(l,t){switch(l){case"compositionend":return If(t);case"keypress":return t.which!==32?null:(kf=!0,$f);case"textInput":return l=t.data,l===$f&&kf?null:l;default:return null}}function Nm(l,t){if(Fe)return l==="compositionend"||!yc&&Ff(l,t)?(l=Vf(),Cu=fc=le=null,Fe=!1,l):null;switch(l){case"paste":return 
null;case"keypress":if(!(t.ctrlKey||t.altKey||t.metaKey)||t.ctrlKey&&t.altKey){if(t.char&&1<t.char.length)return t.char;if(t.which)return String.fromCharCode(t.which)}return null;case"compositionend":return Wf&&t.locale!=="ko"?null:t.data;default:return null}}var Um={color:!0,date:!0,datetime:!0,"datetime-local":!0,email:!0,month:!0,number:!0,password:!0,range:!0,search:!0,tel:!0,text:!0,time:!0,url:!0,week:!0};function Pf(l){var t=l&&l.nodeName&&l.nodeName.toLowerCase();return t==="input"?!!Um[l.type]:t==="textarea"}function ls(l,t,e,a){$e?ke?ke.push(a):ke=[a]:$e=a,t=On(t,"onChange"),0<t.length&&(e=new Hu("onChange","change",null,e,a),l.push({event:e,listeners:t}))}var qa=null,Ya=null;function Cm(l){Bd(l,0)}function qu(l){var t=Ua(l);if(Hf(t))return l}function ts(l,t){if(l==="change")return t}var es=!1;if(Bt){var rc;if(Bt){var hc="oninput"in document;if(!hc){var as=document.createElement("div");as.setAttribute("oninput","return;"),hc=typeof as.oninput=="function"}rc=hc}else rc=!1;es=rc&&(!document.documentMode||9<document.documentMode)}function us(){qa&&(qa.detachEvent("onpropertychange",ns),Ya=qa=null)}function ns(l){if(l.propertyName==="value"&&qu(Ya)){var t=[];ls(t,Ya,l,nc(l)),Zf(Cm,t)}}function Rm(l,t,e){l==="focusin"?(us(),qa=t,Ya=e,qa.attachEvent("onpropertychange",ns)):l==="focusout"&&us()}function jm(l){if(l==="selectionchange"||l==="keyup"||l==="keydown")return qu(Ya)}function Hm(l,t){if(l==="click")return qu(t)}function Bm(l,t){if(l==="input"||l==="change")return qu(t)}function qm(l,t){return l===t&&(l!==0||1/l===1/t)||l!==l&&t!==t}var ft=typeof Object.is=="function"?Object.is:qm;function Ga(l,t){if(ft(l,t))return!0;if(typeof l!="object"||l===null||typeof t!="object"||t===null)return!1;var e=Object.keys(l),a=Object.keys(t);if(e.length!==a.length)return!1;for(a=0;a<e.length;a++){var u=e[a];if(!Jn.call(t,u)||!ft(l[u],t[u]))return!1}return!0}function cs(l){for(;l&&l.firstChild;)l=l.firstChild;return l}function is(l,t){var e=cs(l);l=0;for(var 
a;e;){if(e.nodeType===3){if(a=l+e.textContent.length,l<=t&&a>=t)return{node:e,offset:t-l};l=a}l:{for(;e;){if(e.nextSibling){e=e.nextSibling;break l}e=e.parentNode}e=void 0}e=cs(e)}}function fs(l,t){return l&&t?l===t?!0:l&&l.nodeType===3?!1:t&&t.nodeType===3?fs(l,t.parentNode):"contains"in l?l.contains(t):l.compareDocumentPosition?!!(l.compareDocumentPosition(t)&16):!1:!1}function ss(l){l=l!=null&&l.ownerDocument!=null&&l.ownerDocument.defaultView!=null?l.ownerDocument.defaultView:window;for(var t=Nu(l.document);t instanceof l.HTMLIFrameElement;){try{var e=typeof t.contentWindow.location.href=="string"}catch{e=!1}if(e)l=t.contentWindow;else break;t=Nu(l.document)}return t}function vc(l){var t=l&&l.nodeName&&l.nodeName.toLowerCase();return t&&(t==="input"&&(l.type==="text"||l.type==="search"||l.type==="tel"||l.type==="url"||l.type==="password")||t==="textarea"||l.contentEditable==="true")}var Ym=Bt&&"documentMode"in document&&11>=document.documentMode,Ie=null,gc=null,Qa=null,Sc=!1;function ds(l,t,e){var a=e.window===e?e.document:e.nodeType===9?e:e.ownerDocument;Sc||Ie==null||Ie!==Nu(a)||(a=Ie,"selectionStart"in a&&vc(a)?a={start:a.selectionStart,end:a.selectionEnd}:(a=(a.ownerDocument&&a.ownerDocument.defaultView||window).getSelection(),a={anchorNode:a.anchorNode,anchorOffset:a.anchorOffset,focusNode:a.focusNode,focusOffset:a.focusOffset}),Qa&&Ga(Qa,a)||(Qa=a,a=On(gc,"onSelect"),0<a.length&&(t=new Hu("onSelect","select",null,t,e),l.push({event:t,listeners:a}),t.target=Ie)))}function Me(l,t){var e={};return e[l.toLowerCase()]=t.toLowerCase(),e["Webkit"+l]="webkit"+t,e["Moz"+l]="moz"+t,e}var 
Pe={animationend:Me("Animation","AnimationEnd"),animationiteration:Me("Animation","AnimationIteration"),animationstart:Me("Animation","AnimationStart"),transitionrun:Me("Transition","TransitionRun"),transitionstart:Me("Transition","TransitionStart"),transitioncancel:Me("Transition","TransitionCancel"),transitionend:Me("Transition","TransitionEnd")},bc={},os={};Bt&&(os=document.createElement("div").style,"AnimationEvent"in window||(delete Pe.animationend.animation,delete Pe.animationiteration.animation,delete Pe.animationstart.animation),"TransitionEvent"in window||delete Pe.transitionend.transition);function xe(l){if(bc[l])return bc[l];if(!Pe[l])return l;var t=Pe[l],e;for(e in t)if(t.hasOwnProperty(e)&&e in os)return bc[l]=t[e];return l}var ms=xe("animationend"),ys=xe("animationiteration"),rs=xe("animationstart"),Gm=xe("transitionrun"),Qm=xe("transitionstart"),Xm=xe("transitioncancel"),hs=xe("transitionend"),vs=new Map,zc="abort auxClick beforeToggle cancel canPlay canPlayThrough click close contextMenu copy cut drag dragEnd dragEnter dragExit dragLeave dragOver dragStart drop durationChange emptied encrypted ended error gotPointerCapture input invalid keyDown keyPress keyUp load loadedData loadedMetadata loadStart lostPointerCapture mouseDown mouseMove mouseOut mouseOver mouseUp paste pause play playing pointerCancel pointerDown pointerMove pointerOut pointerOver pointerUp progress rateChange reset resize seeked seeking stalled submit suspend timeUpdate touchCancel touchEnd touchStart volumeChange scroll toggle touchMove waiting wheel".split(" ");zc.push("scrollEnd");function Ot(l,t){vs.set(l,t),_e(t,[l])}var Yu=typeof reportError=="function"?reportError:function(l){if(typeof window=="object"&&typeof window.ErrorEvent=="function"){var t=new window.ErrorEvent("error",{bubbles:!0,cancelable:!0,message:typeof l=="object"&&l!==null&&typeof l.message=="string"?String(l.message):String(l),error:l});if(!window.dispatchEvent(t))return}else if(typeof 
process=="object"&&typeof process.emit=="function"){process.emit("uncaughtException",l);return}console.error(l)},gt=[],la=0,Ec=0;function Gu(){for(var l=la,t=Ec=la=0;t<l;){var e=gt[t];gt[t++]=null;var a=gt[t];gt[t++]=null;var u=gt[t];gt[t++]=null;var n=gt[t];if(gt[t++]=null,a!==null&&u!==null){var c=a.pending;c===null?u.next=u:(u.next=c.next,c.next=u),a.pending=u}n!==0&&gs(e,u,n)}}function Qu(l,t,e,a){gt[la++]=l,gt[la++]=t,gt[la++]=e,gt[la++]=a,Ec|=a,l.lanes|=a,l=l.alternate,l!==null&&(l.lanes|=a)}function Tc(l,t,e,a){return Qu(l,t,e,a),Xu(l)}function De(l,t){return Qu(l,null,null,t),Xu(l)}function gs(l,t,e){l.lanes|=e;var a=l.alternate;a!==null&&(a.lanes|=e);for(var u=!1,n=l.return;n!==null;)n.childLanes|=e,a=n.alternate,a!==null&&(a.childLanes|=e),n.tag===22&&(l=n.stateNode,l===null||l._visibility&1||(u=!0)),l=n,n=n.return;return l.tag===3?(n=l.stateNode,u&&t!==null&&(u=31-it(e),l=n.hiddenUpdates,a=l[u],a===null?l[u]=[t]:a.push(t),t.lane=e|536870912),n):null}function Xu(l){if(50<fu)throw fu=0,Ui=null,Error(o(185));for(var t=l.return;t!==null;)l=t,t=l.return;return l.tag===3?l.stateNode:null}var ta={};function Zm(l,t,e,a){this.tag=l,this.key=e,this.sibling=this.child=this.return=this.stateNode=this.type=this.elementType=null,this.index=0,this.refCleanup=this.ref=null,this.pendingProps=t,this.dependencies=this.memoizedState=this.updateQueue=this.memoizedProps=null,this.mode=a,this.subtreeFlags=this.flags=0,this.deletions=null,this.childLanes=this.lanes=0,this.alternate=null}function st(l,t,e,a){return new Zm(l,t,e,a)}function Ac(l){return l=l.prototype,!(!l||!l.isReactComponent)}function qt(l,t){var e=l.alternate;return 
e===null?(e=st(l.tag,t,l.key,l.mode),e.elementType=l.elementType,e.type=l.type,e.stateNode=l.stateNode,e.alternate=l,l.alternate=e):(e.pendingProps=t,e.type=l.type,e.flags=0,e.subtreeFlags=0,e.deletions=null),e.flags=l.flags&65011712,e.childLanes=l.childLanes,e.lanes=l.lanes,e.child=l.child,e.memoizedProps=l.memoizedProps,e.memoizedState=l.memoizedState,e.updateQueue=l.updateQueue,t=l.dependencies,e.dependencies=t===null?null:{lanes:t.lanes,firstContext:t.firstContext},e.sibling=l.sibling,e.index=l.index,e.ref=l.ref,e.refCleanup=l.refCleanup,e}function Ss(l,t){l.flags&=65011714;var e=l.alternate;return e===null?(l.childLanes=0,l.lanes=t,l.child=null,l.subtreeFlags=0,l.memoizedProps=null,l.memoizedState=null,l.updateQueue=null,l.dependencies=null,l.stateNode=null):(l.childLanes=e.childLanes,l.lanes=e.lanes,l.child=e.child,l.subtreeFlags=0,l.deletions=null,l.memoizedProps=e.memoizedProps,l.memoizedState=e.memoizedState,l.updateQueue=e.updateQueue,l.type=e.type,t=e.dependencies,l.dependencies=t===null?null:{lanes:t.lanes,firstContext:t.firstContext}),l}function Zu(l,t,e,a,u,n){var c=0;if(a=l,typeof l=="function")Ac(l)&&(c=1);else if(typeof l=="string")c=wy(l,e,N.current)?26:l==="html"||l==="head"||l==="body"?27:5;else l:switch(l){case Kl:return l=st(31,e,t,u),l.elementType=Kl,l.lanes=n,l;case vl:return Ne(e.children,u,n,t);case ul:c=8,u|=24;break;case zl:return l=st(12,e,t,u|2),l.elementType=zl,l.lanes=n,l;case Hl:return l=st(13,e,t,u),l.elementType=Hl,l.lanes=n,l;case Tl:return l=st(19,e,t,u),l.elementType=Tl,l.lanes=n,l;default:if(typeof l=="object"&&l!==null)switch(l.$$typeof){case nl:c=10;break l;case $:c=9;break l;case al:c=11;break l;case W:c=14;break l;case Bl:c=16,a=null;break l}c=29,e=Error(o(130,l===null?"null":typeof l,"")),a=null}return t=st(c,e,t,u),t.elementType=l,t.type=a,t.lanes=n,t}function Ne(l,t,e,a){return l=st(7,l,a,t),l.lanes=e,l}function pc(l,t,e){return l=st(6,l,null,t),l.lanes=e,l}function bs(l){var t=st(18,null,null,0);return 
t.stateNode=l,t}function _c(l,t,e){return t=st(4,l.children!==null?l.children:[],l.key,t),t.lanes=e,t.stateNode={containerInfo:l.containerInfo,pendingChildren:null,implementation:l.implementation},t}var zs=new WeakMap;function St(l,t){if(typeof l=="object"&&l!==null){var e=zs.get(l);return e!==void 0?e:(t={value:l,source:t,stack:bf(t)},zs.set(l,t),t)}return{value:l,source:t,stack:bf(t)}}var ea=[],aa=0,Vu=null,Xa=0,bt=[],zt=0,te=null,Dt=1,Nt="";function Yt(l,t){ea[aa++]=Xa,ea[aa++]=Vu,Vu=l,Xa=t}function Es(l,t,e){bt[zt++]=Dt,bt[zt++]=Nt,bt[zt++]=te,te=l;var a=Dt;l=Nt;var u=32-it(a)-1;a&=~(1<<u),e+=1;var n=32-it(t)+u;if(30<n){var c=u-u%5;n=(a&(1<<c)-1).toString(32),a>>=c,u-=c,Dt=1<<32-it(t)+u|e<<u|a,Nt=n+l}else Dt=1<<n|e<<u|a,Nt=l}function Oc(l){l.return!==null&&(Yt(l,1),Es(l,1,0))}function Mc(l){for(;l===Vu;)Vu=ea[--aa],ea[aa]=null,Xa=ea[--aa],ea[aa]=null;for(;l===te;)te=bt[--zt],bt[zt]=null,Nt=bt[--zt],bt[zt]=null,Dt=bt[--zt],bt[zt]=null}function Ts(l,t){bt[zt++]=Dt,bt[zt++]=Nt,bt[zt++]=te,Dt=t.id,Nt=t.overflow,te=l}var Xl=null,Al=null,el=!1,ee=null,Et=!1,xc=Error(o(519));function ae(l){var t=Error(o(418,1<arguments.length&&arguments[1]!==void 0&&arguments[1]?"text":"HTML",""));throw Za(St(t,l)),xc}function As(l){var t=l.stateNode,e=l.type,a=l.memoizedProps;switch(t[Ql]=l,t[Fl]=a,e){case"dialog":I("cancel",t),I("close",t);break;case"iframe":case"object":case"embed":I("load",t);break;case"video":case"audio":for(e=0;e<du.length;e++)I(du[e],t);break;case"source":I("error",t);break;case"img":case"image":case"link":I("error",t),I("load",t);break;case"details":I("toggle",t);break;case"input":I("invalid",t),Bf(t,a.value,a.defaultValue,a.checked,a.defaultChecked,a.type,a.name,!0);break;case"select":I("invalid",t);break;case"textarea":I("invalid",t),Yf(t,a.value,a.defaultValue,a.children)}e=a.children,typeof e!="string"&&typeof e!="number"&&typeof 
e!="bigint"||t.textContent===""+e||a.suppressHydrationWarning===!0||Qd(t.textContent,e)?(a.popover!=null&&(I("beforetoggle",t),I("toggle",t)),a.onScroll!=null&&I("scroll",t),a.onScrollEnd!=null&&I("scrollend",t),a.onClick!=null&&(t.onclick=Ht),t=!0):t=!1,t||ae(l,!0)}function ps(l){for(Xl=l.return;Xl;)switch(Xl.tag){case 5:case 31:case 13:Et=!1;return;case 27:case 3:Et=!0;return;default:Xl=Xl.return}}function ua(l){if(l!==Xl)return!1;if(!el)return ps(l),el=!0,!1;var t=l.tag,e;if((e=t!==3&&t!==27)&&((e=t===5)&&(e=l.type,e=!(e!=="form"&&e!=="button")||Ji(l.type,l.memoizedProps)),e=!e),e&&Al&&ae(l),ps(l),t===13){if(l=l.memoizedState,l=l!==null?l.dehydrated:null,!l)throw Error(o(317));Al=$d(l)}else if(t===31){if(l=l.memoizedState,l=l!==null?l.dehydrated:null,!l)throw Error(o(317));Al=$d(l)}else t===27?(t=Al,ge(l.type)?(l=Fi,Fi=null,Al=l):Al=t):Al=Xl?At(l.stateNode.nextSibling):null;return!0}function Ue(){Al=Xl=null,el=!1}function Dc(){var l=ee;return l!==null&&(et===null?et=l:et.push.apply(et,l),ee=null),l}function Za(l){ee===null?ee=[l]:ee.push(l)}var Nc=d(null),Ce=null,Gt=null;function ue(l,t,e){O(Nc,t._currentValue),t._currentValue=e}function Qt(l){l._currentValue=Nc.current,g(Nc)}function Uc(l,t,e){for(;l!==null;){var a=l.alternate;if((l.childLanes&t)!==t?(l.childLanes|=t,a!==null&&(a.childLanes|=t)):a!==null&&(a.childLanes&t)!==t&&(a.childLanes|=t),l===e)break;l=l.return}}function Cc(l,t,e,a){var u=l.child;for(u!==null&&(u.return=l);u!==null;){var n=u.dependencies;if(n!==null){var c=u.child;n=n.firstContext;l:for(;n!==null;){var i=n;n=u;for(var f=0;f<t.length;f++)if(i.context===t[f]){n.lanes|=e,i=n.alternate,i!==null&&(i.lanes|=e),Uc(n.return,e,l),a||(c=null);break l}n=i.next}}else if(u.tag===18){if(c=u.return,c===null)throw Error(o(341));c.lanes|=e,n=c.alternate,n!==null&&(n.lanes|=e),Uc(c,e,l),c=null}else c=u.child;if(c!==null)c.return=u;else for(c=u;c!==null;){if(c===l){c=null;break}if(u=c.sibling,u!==null){u.return=c.return,c=u;break}c=c.return}u=c}}function 
na(l,t,e,a){l=null;for(var u=t,n=!1;u!==null;){if(!n){if((u.flags&524288)!==0)n=!0;else if((u.flags&262144)!==0)break}if(u.tag===10){var c=u.alternate;if(c===null)throw Error(o(387));if(c=c.memoizedProps,c!==null){var i=u.type;ft(u.pendingProps.value,c.value)||(l!==null?l.push(i):l=[i])}}else if(u===ol.current){if(c=u.alternate,c===null)throw Error(o(387));c.memoizedState.memoizedState!==u.memoizedState.memoizedState&&(l!==null?l.push(hu):l=[hu])}u=u.return}l!==null&&Cc(t,l,e,a),t.flags|=262144}function Lu(l){for(l=l.firstContext;l!==null;){if(!ft(l.context._currentValue,l.memoizedValue))return!0;l=l.next}return!1}function Re(l){Ce=l,Gt=null,l=l.dependencies,l!==null&&(l.firstContext=null)}function Zl(l){return _s(Ce,l)}function Ku(l,t){return Ce===null&&Re(l),_s(l,t)}function _s(l,t){var e=t._currentValue;if(t={context:t,memoizedValue:e,next:null},Gt===null){if(l===null)throw Error(o(308));Gt=t,l.dependencies={lanes:0,firstContext:t},l.flags|=524288}else Gt=Gt.next=t;return e}var Vm=typeof AbortController<"u"?AbortController:function(){var l=[],t=this.signal={aborted:!1,addEventListener:function(e,a){l.push(a)}};this.abort=function(){t.aborted=!0,l.forEach(function(e){return e()})}},Lm=h.unstable_scheduleCallback,Km=h.unstable_NormalPriority,Ul={$$typeof:nl,Consumer:null,Provider:null,_currentValue:null,_currentValue2:null,_threadCount:0};function Rc(){return{controller:new Vm,data:new Map,refCount:0}}function Va(l){l.refCount--,l.refCount===0&&Lm(Km,function(){l.controller.abort()})}var La=null,jc=0,ca=0,ia=null;function Jm(l,t){if(La===null){var e=La=[];jc=0,ca=qi(),ia={status:"pending",value:void 0,then:function(a){e.push(a)}}}return jc++,t.then(Os,Os),t}function Os(){if(--jc===0&&La!==null){ia!==null&&(ia.status="fulfilled");var l=La;La=null,ca=0,ia=null;for(var t=0;t<l.length;t++)(0,l[t])()}}function wm(l,t){var e=[],a={status:"pending",value:null,reason:null,then:function(u){e.push(u)}};return l.then(function(){a.status="fulfilled",a.value=t;for(var 
u=0;u<e.length;u++)(0,e[u])(t)},function(u){for(a.status="rejected",a.reason=u,u=0;u<e.length;u++)(0,e[u])(void 0)}),a}var Ms=z.S;z.S=function(l,t){dd=nt(),typeof t=="object"&&t!==null&&typeof t.then=="function"&&Jm(l,t),Ms!==null&&Ms(l,t)};var je=d(null);function Hc(){var l=je.current;return l!==null?l:El.pooledCache}function Ju(l,t){t===null?O(je,je.current):O(je,t.pool)}function xs(){var l=Hc();return l===null?null:{parent:Ul._currentValue,pool:l}}var fa=Error(o(460)),Bc=Error(o(474)),wu=Error(o(542)),Wu={then:function(){}};function Ds(l){return l=l.status,l==="fulfilled"||l==="rejected"}function Ns(l,t,e){switch(e=l[e],e===void 0?l.push(t):e!==t&&(t.then(Ht,Ht),t=e),t.status){case"fulfilled":return t.value;case"rejected":throw l=t.reason,Cs(l),l;default:if(typeof t.status=="string")t.then(Ht,Ht);else{if(l=El,l!==null&&100<l.shellSuspendCounter)throw Error(o(482));l=t,l.status="pending",l.then(function(a){if(t.status==="pending"){var u=t;u.status="fulfilled",u.value=a}},function(a){if(t.status==="pending"){var u=t;u.status="rejected",u.reason=a}})}switch(t.status){case"fulfilled":return t.value;case"rejected":throw l=t.reason,Cs(l),l}throw Be=t,fa}}function He(l){try{var t=l._init;return t(l._payload)}catch(e){throw e!==null&&typeof e=="object"&&typeof e.then=="function"?(Be=e,fa):e}}var Be=null;function Us(){if(Be===null)throw Error(o(459));var l=Be;return Be=null,l}function Cs(l){if(l===fa||l===wu)throw Error(o(483))}var sa=null,Ka=0;function $u(l){var t=Ka;return Ka+=1,sa===null&&(sa=[]),Ns(sa,l,t)}function Ja(l,t){t=t.props.ref,l.ref=t!==void 0?t:null}function ku(l,t){throw t.$$typeof===x?Error(o(525)):(l=Object.prototype.toString.call(t),Error(o(31,l==="[object Object]"?"object with keys {"+Object.keys(t).join(", ")+"}":l)))}function Rs(l){function t(m,s){if(l){var y=m.deletions;y===null?(m.deletions=[s],m.flags|=16):y.push(s)}}function e(m,s){if(!l)return null;for(;s!==null;)t(m,s),s=s.sibling;return null}function a(m){for(var s=new 
Map;m!==null;)m.key!==null?s.set(m.key,m):s.set(m.index,m),m=m.sibling;return s}function u(m,s){return m=qt(m,s),m.index=0,m.sibling=null,m}function n(m,s,y){return m.index=y,l?(y=m.alternate,y!==null?(y=y.index,y<s?(m.flags|=67108866,s):y):(m.flags|=67108866,s)):(m.flags|=1048576,s)}function c(m){return l&&m.alternate===null&&(m.flags|=67108866),m}function i(m,s,y,T){return s===null||s.tag!==6?(s=pc(y,m.mode,T),s.return=m,s):(s=u(s,y),s.return=m,s)}function f(m,s,y,T){var Q=y.type;return Q===vl?b(m,s,y.props.children,T,y.key):s!==null&&(s.elementType===Q||typeof Q=="object"&&Q!==null&&Q.$$typeof===Bl&&He(Q)===s.type)?(s=u(s,y.props),Ja(s,y),s.return=m,s):(s=Zu(y.type,y.key,y.props,null,m.mode,T),Ja(s,y),s.return=m,s)}function r(m,s,y,T){return s===null||s.tag!==4||s.stateNode.containerInfo!==y.containerInfo||s.stateNode.implementation!==y.implementation?(s=_c(y,m.mode,T),s.return=m,s):(s=u(s,y.children||[]),s.return=m,s)}function b(m,s,y,T,Q){return s===null||s.tag!==7?(s=Ne(y,m.mode,T,Q),s.return=m,s):(s=u(s,y),s.return=m,s)}function A(m,s,y){if(typeof s=="string"&&s!==""||typeof s=="number"||typeof s=="bigint")return s=pc(""+s,m.mode,y),s.return=m,s;if(typeof s=="object"&&s!==null){switch(s.$$typeof){case P:return y=Zu(s.type,s.key,s.props,null,m.mode,y),Ja(y,s),y.return=m,y;case k:return s=_c(s,m.mode,y),s.return=m,s;case Bl:return s=He(s),A(m,s,y)}if(ut(s)||fl(s))return s=Ne(s,m.mode,y,null),s.return=m,s;if(typeof s.then=="function")return A(m,$u(s),y);if(s.$$typeof===nl)return A(m,Ku(m,s),y);ku(m,s)}return null}function v(m,s,y,T){var Q=s!==null?s.key:null;if(typeof y=="string"&&y!==""||typeof y=="number"||typeof y=="bigint")return Q!==null?null:i(m,s,""+y,T);if(typeof y=="object"&&y!==null){switch(y.$$typeof){case P:return y.key===Q?f(m,s,y,T):null;case k:return y.key===Q?r(m,s,y,T):null;case Bl:return y=He(y),v(m,s,y,T)}if(ut(y)||fl(y))return Q!==null?null:b(m,s,y,T,null);if(typeof y.then=="function")return v(m,s,$u(y),T);if(y.$$typeof===nl)return 
v(m,s,Ku(m,y),T);ku(m,y)}return null}function S(m,s,y,T,Q){if(typeof T=="string"&&T!==""||typeof T=="number"||typeof T=="bigint")return m=m.get(y)||null,i(s,m,""+T,Q);if(typeof T=="object"&&T!==null){switch(T.$$typeof){case P:return m=m.get(T.key===null?y:T.key)||null,f(s,m,T,Q);case k:return m=m.get(T.key===null?y:T.key)||null,r(s,m,T,Q);case Bl:return T=He(T),S(m,s,y,T,Q)}if(ut(T)||fl(T))return m=m.get(y)||null,b(s,m,T,Q,null);if(typeof T.then=="function")return S(m,s,y,$u(T),Q);if(T.$$typeof===nl)return S(m,s,y,Ku(s,T),Q);ku(s,T)}return null}function j(m,s,y,T){for(var Q=null,cl=null,B=s,w=s=0,tl=null;B!==null&&w<y.length;w++){B.index>w?(tl=B,B=null):tl=B.sibling;var il=v(m,B,y[w],T);if(il===null){B===null&&(B=tl);break}l&&B&&il.alternate===null&&t(m,B),s=n(il,s,w),cl===null?Q=il:cl.sibling=il,cl=il,B=tl}if(w===y.length)return e(m,B),el&&Yt(m,w),Q;if(B===null){for(;w<y.length;w++)B=A(m,y[w],T),B!==null&&(s=n(B,s,w),cl===null?Q=B:cl.sibling=B,cl=B);return el&&Yt(m,w),Q}for(B=a(B);w<y.length;w++)tl=S(B,m,w,y[w],T),tl!==null&&(l&&tl.alternate!==null&&B.delete(tl.key===null?w:tl.key),s=n(tl,s,w),cl===null?Q=tl:cl.sibling=tl,cl=tl);return l&&B.forEach(function(Te){return t(m,Te)}),el&&Yt(m,w),Q}function Z(m,s,y,T){if(y==null)throw Error(o(151));for(var Q=null,cl=null,B=s,w=s=0,tl=null,il=y.next();B!==null&&!il.done;w++,il=y.next()){B.index>w?(tl=B,B=null):tl=B.sibling;var Te=v(m,B,il.value,T);if(Te===null){B===null&&(B=tl);break}l&&B&&Te.alternate===null&&t(m,B),s=n(Te,s,w),cl===null?Q=Te:cl.sibling=Te,cl=Te,B=tl}if(il.done)return e(m,B),el&&Yt(m,w),Q;if(B===null){for(;!il.done;w++,il=y.next())il=A(m,il.value,T),il!==null&&(s=n(il,s,w),cl===null?Q=il:cl.sibling=il,cl=il);return el&&Yt(m,w),Q}for(B=a(B);!il.done;w++,il=y.next())il=S(B,m,w,il.value,T),il!==null&&(l&&il.alternate!==null&&B.delete(il.key===null?w:il.key),s=n(il,s,w),cl===null?Q=il:cl.sibling=il,cl=il);return l&&B.forEach(function(ur){return t(m,ur)}),el&&Yt(m,w),Q}function bl(m,s,y,T){if(typeof 
y=="object"&&y!==null&&y.type===vl&&y.key===null&&(y=y.props.children),typeof y=="object"&&y!==null){switch(y.$$typeof){case P:l:{for(var Q=y.key;s!==null;){if(s.key===Q){if(Q=y.type,Q===vl){if(s.tag===7){e(m,s.sibling),T=u(s,y.props.children),T.return=m,m=T;break l}}else if(s.elementType===Q||typeof Q=="object"&&Q!==null&&Q.$$typeof===Bl&&He(Q)===s.type){e(m,s.sibling),T=u(s,y.props),Ja(T,y),T.return=m,m=T;break l}e(m,s);break}else t(m,s);s=s.sibling}y.type===vl?(T=Ne(y.props.children,m.mode,T,y.key),T.return=m,m=T):(T=Zu(y.type,y.key,y.props,null,m.mode,T),Ja(T,y),T.return=m,m=T)}return c(m);case k:l:{for(Q=y.key;s!==null;){if(s.key===Q)if(s.tag===4&&s.stateNode.containerInfo===y.containerInfo&&s.stateNode.implementation===y.implementation){e(m,s.sibling),T=u(s,y.children||[]),T.return=m,m=T;break l}else{e(m,s);break}else t(m,s);s=s.sibling}T=_c(y,m.mode,T),T.return=m,m=T}return c(m);case Bl:return y=He(y),bl(m,s,y,T)}if(ut(y))return j(m,s,y,T);if(fl(y)){if(Q=fl(y),typeof Q!="function")throw Error(o(150));return y=Q.call(y),Z(m,s,y,T)}if(typeof y.then=="function")return bl(m,s,$u(y),T);if(y.$$typeof===nl)return bl(m,s,Ku(m,y),T);ku(m,y)}return typeof y=="string"&&y!==""||typeof y=="number"||typeof y=="bigint"?(y=""+y,s!==null&&s.tag===6?(e(m,s.sibling),T=u(s,y),T.return=m,m=T):(e(m,s),T=pc(y,m.mode,T),T.return=m,m=T),c(m)):e(m,s)}return function(m,s,y,T){try{Ka=0;var Q=bl(m,s,y,T);return sa=null,Q}catch(B){if(B===fa||B===wu)throw B;var cl=st(29,B,null,m.mode);return cl.lanes=T,cl.return=m,cl}}}var qe=Rs(!0),js=Rs(!1),ne=!1;function qc(l){l.updateQueue={baseState:l.memoizedState,firstBaseUpdate:null,lastBaseUpdate:null,shared:{pending:null,lanes:0,hiddenCallbacks:null},callbacks:null}}function Yc(l,t){l=l.updateQueue,t.updateQueue===l&&(t.updateQueue={baseState:l.baseState,firstBaseUpdate:l.firstBaseUpdate,lastBaseUpdate:l.lastBaseUpdate,shared:l.shared,callbacks:null})}function ce(l){return{lane:l,tag:0,payload:null,callback:null,next:null}}function ie(l,t,e){var 
a=l.updateQueue;if(a===null)return null;if(a=a.shared,(sl&2)!==0){var u=a.pending;return u===null?t.next=t:(t.next=u.next,u.next=t),a.pending=t,t=Xu(l),gs(l,null,e),t}return Qu(l,a,t,e),Xu(l)}function wa(l,t,e){if(t=t.updateQueue,t!==null&&(t=t.shared,(e&4194048)!==0)){var a=t.lanes;a&=l.pendingLanes,e|=a,t.lanes=e,_f(l,e)}}function Gc(l,t){var e=l.updateQueue,a=l.alternate;if(a!==null&&(a=a.updateQueue,e===a)){var u=null,n=null;if(e=e.firstBaseUpdate,e!==null){do{var c={lane:e.lane,tag:e.tag,payload:e.payload,callback:null,next:null};n===null?u=n=c:n=n.next=c,e=e.next}while(e!==null);n===null?u=n=t:n=n.next=t}else u=n=t;e={baseState:a.baseState,firstBaseUpdate:u,lastBaseUpdate:n,shared:a.shared,callbacks:a.callbacks},l.updateQueue=e;return}l=e.lastBaseUpdate,l===null?e.firstBaseUpdate=t:l.next=t,e.lastBaseUpdate=t}var Qc=!1;function Wa(){if(Qc){var l=ia;if(l!==null)throw l}}function $a(l,t,e,a){Qc=!1;var u=l.updateQueue;ne=!1;var n=u.firstBaseUpdate,c=u.lastBaseUpdate,i=u.shared.pending;if(i!==null){u.shared.pending=null;var f=i,r=f.next;f.next=null,c===null?n=r:c.next=r,c=f;var b=l.alternate;b!==null&&(b=b.updateQueue,i=b.lastBaseUpdate,i!==c&&(i===null?b.firstBaseUpdate=r:i.next=r,b.lastBaseUpdate=f))}if(n!==null){var A=u.baseState;c=0,b=r=f=null,i=n;do{var v=i.lane&-536870913,S=v!==i.lane;if(S?(ll&v)===v:(a&v)===v){v!==0&&v===ca&&(Qc=!0),b!==null&&(b=b.next={lane:0,tag:i.tag,payload:i.payload,callback:null,next:null});l:{var j=l,Z=i;v=t;var bl=e;switch(Z.tag){case 1:if(j=Z.payload,typeof j=="function"){A=j.call(bl,A,v);break l}A=j;break l;case 3:j.flags=j.flags&-65537|128;case 0:if(j=Z.payload,v=typeof j=="function"?j.call(bl,A,v):j,v==null)break l;A=U({},A,v);break l;case 2:ne=!0}}v=i.callback,v!==null&&(l.flags|=64,S&&(l.flags|=8192),S=u.callbacks,S===null?u.callbacks=[v]:S.push(v))}else 
S={lane:v,tag:i.tag,payload:i.payload,callback:i.callback,next:null},b===null?(r=b=S,f=A):b=b.next=S,c|=v;if(i=i.next,i===null){if(i=u.shared.pending,i===null)break;S=i,i=S.next,S.next=null,u.lastBaseUpdate=S,u.shared.pending=null}}while(!0);b===null&&(f=A),u.baseState=f,u.firstBaseUpdate=r,u.lastBaseUpdate=b,n===null&&(u.shared.lanes=0),me|=c,l.lanes=c,l.memoizedState=A}}function Hs(l,t){if(typeof l!="function")throw Error(o(191,l));l.call(t)}function Bs(l,t){var e=l.callbacks;if(e!==null)for(l.callbacks=null,l=0;l<e.length;l++)Hs(e[l],t)}var da=d(null),Fu=d(0);function qs(l,t){l=$t,O(Fu,l),O(da,t),$t=l|t.baseLanes}function Xc(){O(Fu,$t),O(da,da.current)}function Zc(){$t=Fu.current,g(da),g(Fu)}var dt=d(null),Tt=null;function fe(l){var t=l.alternate;O(Dl,Dl.current&1),O(dt,l),Tt===null&&(t===null||da.current!==null||t.memoizedState!==null)&&(Tt=l)}function Vc(l){O(Dl,Dl.current),O(dt,l),Tt===null&&(Tt=l)}function Ys(l){l.tag===22?(O(Dl,Dl.current),O(dt,l),Tt===null&&(Tt=l)):se()}function se(){O(Dl,Dl.current),O(dt,dt.current)}function ot(l){g(dt),Tt===l&&(Tt=null),g(Dl)}var Dl=d(0);function Iu(l){for(var t=l;t!==null;){if(t.tag===13){var e=t.memoizedState;if(e!==null&&(e=e.dehydrated,e===null||$i(e)||ki(e)))return t}else if(t.tag===19&&(t.memoizedProps.revealOrder==="forwards"||t.memoizedProps.revealOrder==="backwards"||t.memoizedProps.revealOrder==="unstable_legacy-backwards"||t.memoizedProps.revealOrder==="together")){if((t.flags&128)!==0)return t}else if(t.child!==null){t.child.return=t,t=t.child;continue}if(t===l)break;for(;t.sibling===null;){if(t.return===null||t.return===l)return null;t=t.return}t.sibling.return=t.return,t=t.sibling}return null}var Xt=0,J=null,gl=null,Cl=null,Pu=!1,oa=!1,Ye=!1,ln=0,ka=0,ma=null,Wm=0;function Ol(){throw Error(o(321))}function Lc(l,t){if(t===null)return!1;for(var e=0;e<t.length&&e<l.length;e++)if(!ft(l[e],t[e]))return!1;return!0}function Kc(l,t,e,a,u,n){return 
Xt=n,J=t,t.memoizedState=null,t.updateQueue=null,t.lanes=0,z.H=l===null||l.memoizedState===null?E0:ci,Ye=!1,n=e(a,u),Ye=!1,oa&&(n=Qs(t,e,a,u)),Gs(l),n}function Gs(l){z.H=Pa;var t=gl!==null&&gl.next!==null;if(Xt=0,Cl=gl=J=null,Pu=!1,ka=0,ma=null,t)throw Error(o(300));l===null||Rl||(l=l.dependencies,l!==null&&Lu(l)&&(Rl=!0))}function Qs(l,t,e,a){J=l;var u=0;do{if(oa&&(ma=null),ka=0,oa=!1,25<=u)throw Error(o(301));if(u+=1,Cl=gl=null,l.updateQueue!=null){var n=l.updateQueue;n.lastEffect=null,n.events=null,n.stores=null,n.memoCache!=null&&(n.memoCache.index=0)}z.H=T0,n=t(e,a)}while(oa);return n}function $m(){var l=z.H,t=l.useState()[0];return t=typeof t.then=="function"?Fa(t):t,l=l.useState()[0],(gl!==null?gl.memoizedState:null)!==l&&(J.flags|=1024),t}function Jc(){var l=ln!==0;return ln=0,l}function wc(l,t,e){t.updateQueue=l.updateQueue,t.flags&=-2053,l.lanes&=~e}function Wc(l){if(Pu){for(l=l.memoizedState;l!==null;){var t=l.queue;t!==null&&(t.pending=null),l=l.next}Pu=!1}Xt=0,Cl=gl=J=null,oa=!1,ka=ln=0,ma=null}function Wl(){var l={memoizedState:null,baseState:null,baseQueue:null,queue:null,next:null};return Cl===null?J.memoizedState=Cl=l:Cl=Cl.next=l,Cl}function Nl(){if(gl===null){var l=J.alternate;l=l!==null?l.memoizedState:null}else l=gl.next;var t=Cl===null?J.memoizedState:Cl.next;if(t!==null)Cl=t,gl=l;else{if(l===null)throw J.alternate===null?Error(o(467)):Error(o(310));gl=l,l={memoizedState:gl.memoizedState,baseState:gl.baseState,baseQueue:gl.baseQueue,queue:gl.queue,next:null},Cl===null?J.memoizedState=Cl=l:Cl=Cl.next=l}return Cl}function tn(){return{lastEffect:null,events:null,stores:null,memoCache:null}}function Fa(l){var t=ka;return ka+=1,ma===null&&(ma=[]),l=Ns(ma,l,t),t=J,(Cl===null?t.memoizedState:Cl.next)===null&&(t=t.alternate,z.H=t===null||t.memoizedState===null?E0:ci),l}function en(l){if(l!==null&&typeof l=="object"){if(typeof l.then=="function")return Fa(l);if(l.$$typeof===nl)return Zl(l)}throw Error(o(438,String(l)))}function $c(l){var 
t=null,e=J.updateQueue;if(e!==null&&(t=e.memoCache),t==null){var a=J.alternate;a!==null&&(a=a.updateQueue,a!==null&&(a=a.memoCache,a!=null&&(t={data:a.data.map(function(u){return u.slice()}),index:0})))}if(t==null&&(t={data:[],index:0}),e===null&&(e=tn(),J.updateQueue=e),e.memoCache=t,e=t.data[t.index],e===void 0)for(e=t.data[t.index]=Array(l),a=0;a<l;a++)e[a]=$l;return t.index++,e}function Zt(l,t){return typeof t=="function"?t(l):t}function an(l){var t=Nl();return kc(t,gl,l)}function kc(l,t,e){var a=l.queue;if(a===null)throw Error(o(311));a.lastRenderedReducer=e;var u=l.baseQueue,n=a.pending;if(n!==null){if(u!==null){var c=u.next;u.next=n.next,n.next=c}t.baseQueue=u=n,a.pending=null}if(n=l.baseState,u===null)l.memoizedState=n;else{t=u.next;var i=c=null,f=null,r=t,b=!1;do{var A=r.lane&-536870913;if(A!==r.lane?(ll&A)===A:(Xt&A)===A){var v=r.revertLane;if(v===0)f!==null&&(f=f.next={lane:0,revertLane:0,gesture:null,action:r.action,hasEagerState:r.hasEagerState,eagerState:r.eagerState,next:null}),A===ca&&(b=!0);else if((Xt&v)===v){r=r.next,v===ca&&(b=!0);continue}else A={lane:0,revertLane:r.revertLane,gesture:null,action:r.action,hasEagerState:r.hasEagerState,eagerState:r.eagerState,next:null},f===null?(i=f=A,c=n):f=f.next=A,J.lanes|=v,me|=v;A=r.action,Ye&&e(n,A),n=r.hasEagerState?r.eagerState:e(n,A)}else v={lane:A,revertLane:r.revertLane,gesture:r.gesture,action:r.action,hasEagerState:r.hasEagerState,eagerState:r.eagerState,next:null},f===null?(i=f=v,c=n):f=f.next=v,J.lanes|=A,me|=A;r=r.next}while(r!==null&&r!==t);if(f===null?c=n:f.next=i,!ft(n,l.memoizedState)&&(Rl=!0,b&&(e=ia,e!==null)))throw e;l.memoizedState=n,l.baseState=c,l.baseQueue=f,a.lastRenderedState=n}return u===null&&(a.lanes=0),[l.memoizedState,a.dispatch]}function Fc(l){var t=Nl(),e=t.queue;if(e===null)throw Error(o(311));e.lastRenderedReducer=l;var a=e.dispatch,u=e.pending,n=t.memoizedState;if(u!==null){e.pending=null;var c=u=u.next;do 
n=l(n,c.action),c=c.next;while(c!==u);ft(n,t.memoizedState)||(Rl=!0),t.memoizedState=n,t.baseQueue===null&&(t.baseState=n),e.lastRenderedState=n}return[n,a]}function Xs(l,t,e){var a=J,u=Nl(),n=el;if(n){if(e===void 0)throw Error(o(407));e=e()}else e=t();var c=!ft((gl||u).memoizedState,e);if(c&&(u.memoizedState=e,Rl=!0),u=u.queue,li(Ls.bind(null,a,u,l),[l]),u.getSnapshot!==t||c||Cl!==null&&Cl.memoizedState.tag&1){if(a.flags|=2048,ya(9,{destroy:void 0},Vs.bind(null,a,u,e,t),null),El===null)throw Error(o(349));n||(Xt&127)!==0||Zs(a,t,e)}return e}function Zs(l,t,e){l.flags|=16384,l={getSnapshot:t,value:e},t=J.updateQueue,t===null?(t=tn(),J.updateQueue=t,t.stores=[l]):(e=t.stores,e===null?t.stores=[l]:e.push(l))}function Vs(l,t,e,a){t.value=e,t.getSnapshot=a,Ks(t)&&Js(l)}function Ls(l,t,e){return e(function(){Ks(t)&&Js(l)})}function Ks(l){var t=l.getSnapshot;l=l.value;try{var e=t();return!ft(l,e)}catch{return!0}}function Js(l){var t=De(l,2);t!==null&&at(t,l,2)}function Ic(l){var t=Wl();if(typeof l=="function"){var e=l;if(l=e(),Ye){It(!0);try{e()}finally{It(!1)}}}return t.memoizedState=t.baseState=l,t.queue={pending:null,lanes:0,dispatch:null,lastRenderedReducer:Zt,lastRenderedState:l},t}function ws(l,t,e,a){return l.baseState=e,kc(l,gl,typeof a=="function"?a:Zt)}function km(l,t,e,a,u){if(cn(l))throw Error(o(485));if(l=t.action,l!==null){var n={payload:u,action:l,next:null,isTransition:!0,status:"pending",value:null,reason:null,listeners:[],then:function(c){n.listeners.push(c)}};z.T!==null?e(!0):n.isTransition=!1,a(n),e=t.pending,e===null?(n.next=t.pending=n,Ws(t,n)):(n.next=e.next,t.pending=e.next=n)}}function Ws(l,t){var e=t.action,a=t.payload,u=l.state;if(t.isTransition){var n=z.T,c={};z.T=c;try{var i=e(u,a),f=z.S;f!==null&&f(c,i),$s(l,t,i)}catch(r){Pc(l,t,r)}finally{n!==null&&c.types!==null&&(n.types=c.types),z.T=n}}else try{n=e(u,a),$s(l,t,n)}catch(r){Pc(l,t,r)}}function $s(l,t,e){e!==null&&typeof e=="object"&&typeof 
e.then=="function"?e.then(function(a){ks(l,t,a)},function(a){return Pc(l,t,a)}):ks(l,t,e)}function ks(l,t,e){t.status="fulfilled",t.value=e,Fs(t),l.state=e,t=l.pending,t!==null&&(e=t.next,e===t?l.pending=null:(e=e.next,t.next=e,Ws(l,e)))}function Pc(l,t,e){var a=l.pending;if(l.pending=null,a!==null){a=a.next;do t.status="rejected",t.reason=e,Fs(t),t=t.next;while(t!==a)}l.action=null}function Fs(l){l=l.listeners;for(var t=0;t<l.length;t++)(0,l[t])()}function Is(l,t){return t}function Ps(l,t){if(el){var e=El.formState;if(e!==null){l:{var a=J;if(el){if(Al){t:{for(var u=Al,n=Et;u.nodeType!==8;){if(!n){u=null;break t}if(u=At(u.nextSibling),u===null){u=null;break t}}n=u.data,u=n==="F!"||n==="F"?u:null}if(u){Al=At(u.nextSibling),a=u.data==="F!";break l}}ae(a)}a=!1}a&&(t=e[0])}}return e=Wl(),e.memoizedState=e.baseState=t,a={pending:null,lanes:0,dispatch:null,lastRenderedReducer:Is,lastRenderedState:t},e.queue=a,e=S0.bind(null,J,a),a.dispatch=e,a=Ic(!1),n=ni.bind(null,J,!1,a.queue),a=Wl(),u={state:t,dispatch:null,action:l,pending:null},a.queue=u,e=km.bind(null,J,u,n,e),u.dispatch=e,a.memoizedState=l,[t,e,!1]}function l0(l){var t=Nl();return t0(t,gl,l)}function t0(l,t,e){if(t=kc(l,t,Is)[0],l=an(Zt)[0],typeof t=="object"&&t!==null&&typeof t.then=="function")try{var a=Fa(t)}catch(c){throw c===fa?wu:c}else a=t;t=Nl();var u=t.queue,n=u.dispatch;return e!==t.memoizedState&&(J.flags|=2048,ya(9,{destroy:void 0},Fm.bind(null,u,e),null)),[a,n,l]}function Fm(l,t){l.action=t}function e0(l){var t=Nl(),e=gl;if(e!==null)return t0(t,e,l);Nl(),t=t.memoizedState,e=Nl();var a=e.queue.dispatch;return e.memoizedState=l,[t,a,!1]}function ya(l,t,e,a){return l={tag:l,create:e,deps:a,inst:t,next:null},t=J.updateQueue,t===null&&(t=tn(),J.updateQueue=t),e=t.lastEffect,e===null?t.lastEffect=l.next=l:(a=e.next,e.next=l,l.next=a,t.lastEffect=l),l}function a0(){return Nl().memoizedState}function un(l,t,e,a){var u=Wl();J.flags|=l,u.memoizedState=ya(1|t,{destroy:void 0},e,a===void 0?null:a)}function 
nn(l,t,e,a){var u=Nl();a=a===void 0?null:a;var n=u.memoizedState.inst;gl!==null&&a!==null&&Lc(a,gl.memoizedState.deps)?u.memoizedState=ya(t,n,e,a):(J.flags|=l,u.memoizedState=ya(1|t,n,e,a))}function u0(l,t){un(8390656,8,l,t)}function li(l,t){nn(2048,8,l,t)}function Im(l){J.flags|=4;var t=J.updateQueue;if(t===null)t=tn(),J.updateQueue=t,t.events=[l];else{var e=t.events;e===null?t.events=[l]:e.push(l)}}function n0(l){var t=Nl().memoizedState;return Im({ref:t,nextImpl:l}),function(){if((sl&2)!==0)throw Error(o(440));return t.impl.apply(void 0,arguments)}}function c0(l,t){return nn(4,2,l,t)}function i0(l,t){return nn(4,4,l,t)}function f0(l,t){if(typeof t=="function"){l=l();var e=t(l);return function(){typeof e=="function"?e():t(null)}}if(t!=null)return l=l(),t.current=l,function(){t.current=null}}function s0(l,t,e){e=e!=null?e.concat([l]):null,nn(4,4,f0.bind(null,t,l),e)}function ti(){}function d0(l,t){var e=Nl();t=t===void 0?null:t;var a=e.memoizedState;return t!==null&&Lc(t,a[1])?a[0]:(e.memoizedState=[l,t],l)}function o0(l,t){var e=Nl();t=t===void 0?null:t;var a=e.memoizedState;if(t!==null&&Lc(t,a[1]))return a[0];if(a=l(),Ye){It(!0);try{l()}finally{It(!1)}}return e.memoizedState=[a,t],a}function ei(l,t,e){return e===void 0||(Xt&1073741824)!==0&&(ll&261930)===0?l.memoizedState=t:(l.memoizedState=e,l=md(),J.lanes|=l,me|=l,e)}function m0(l,t,e,a){return ft(e,t)?e:da.current!==null?(l=ei(l,e,a),ft(l,t)||(Rl=!0),l):(Xt&42)===0||(Xt&1073741824)!==0&&(ll&261930)===0?(Rl=!0,l.memoizedState=e):(l=md(),J.lanes|=l,me|=l,t)}function y0(l,t,e,a,u){var n=D.p;D.p=n!==0&&8>n?n:8;var c=z.T,i={};z.T=i,ni(l,!1,t,e);try{var f=u(),r=z.S;if(r!==null&&r(i,f),f!==null&&typeof f=="object"&&typeof f.then=="function"){var b=wm(f,a);Ia(l,t,b,rt(l))}else Ia(l,t,a,rt(l))}catch(A){Ia(l,t,{then:function(){},status:"rejected",reason:A},rt())}finally{D.p=n,c!==null&&i.types!==null&&(c.types=i.types),z.T=c}}function Pm(){}function ai(l,t,e,a){if(l.tag!==5)throw Error(o(476));var 
u=r0(l).queue;y0(l,u,t,V,e===null?Pm:function(){return h0(l),e(a)})}function r0(l){var t=l.memoizedState;if(t!==null)return t;t={memoizedState:V,baseState:V,baseQueue:null,queue:{pending:null,lanes:0,dispatch:null,lastRenderedReducer:Zt,lastRenderedState:V},next:null};var e={};return t.next={memoizedState:e,baseState:e,baseQueue:null,queue:{pending:null,lanes:0,dispatch:null,lastRenderedReducer:Zt,lastRenderedState:e},next:null},l.memoizedState=t,l=l.alternate,l!==null&&(l.memoizedState=t),t}function h0(l){var t=r0(l);t.next===null&&(t=l.alternate.memoizedState),Ia(l,t.next.queue,{},rt())}function ui(){return Zl(hu)}function v0(){return Nl().memoizedState}function g0(){return Nl().memoizedState}function ly(l){for(var t=l.return;t!==null;){switch(t.tag){case 24:case 3:var e=rt();l=ce(e);var a=ie(t,l,e);a!==null&&(at(a,t,e),wa(a,t,e)),t={cache:Rc()},l.payload=t;return}t=t.return}}function ty(l,t,e){var a=rt();e={lane:a,revertLane:0,gesture:null,action:e,hasEagerState:!1,eagerState:null,next:null},cn(l)?b0(t,e):(e=Tc(l,t,e,a),e!==null&&(at(e,l,a),z0(e,t,a)))}function S0(l,t,e){var a=rt();Ia(l,t,e,a)}function Ia(l,t,e,a){var u={lane:a,revertLane:0,gesture:null,action:e,hasEagerState:!1,eagerState:null,next:null};if(cn(l))b0(t,u);else{var n=l.alternate;if(l.lanes===0&&(n===null||n.lanes===0)&&(n=t.lastRenderedReducer,n!==null))try{var c=t.lastRenderedState,i=n(c,e);if(u.hasEagerState=!0,u.eagerState=i,ft(i,c))return Qu(l,t,u,0),El===null&&Gu(),!1}catch{}if(e=Tc(l,t,u,a),e!==null)return at(e,l,a),z0(e,t,a),!0}return!1}function ni(l,t,e,a){if(a={lane:2,revertLane:qi(),gesture:null,action:a,hasEagerState:!1,eagerState:null,next:null},cn(l)){if(t)throw Error(o(479))}else t=Tc(l,e,a,2),t!==null&&at(t,l,2)}function cn(l){var t=l.alternate;return l===J||t!==null&&t===J}function b0(l,t){oa=Pu=!0;var e=l.pending;e===null?t.next=t:(t.next=e.next,e.next=t),l.pending=t}function z0(l,t,e){if((e&4194048)!==0){var a=t.lanes;a&=l.pendingLanes,e|=a,t.lanes=e,_f(l,e)}}var 
Pa={readContext:Zl,use:en,useCallback:Ol,useContext:Ol,useEffect:Ol,useImperativeHandle:Ol,useLayoutEffect:Ol,useInsertionEffect:Ol,useMemo:Ol,useReducer:Ol,useRef:Ol,useState:Ol,useDebugValue:Ol,useDeferredValue:Ol,useTransition:Ol,useSyncExternalStore:Ol,useId:Ol,useHostTransitionStatus:Ol,useFormState:Ol,useActionState:Ol,useOptimistic:Ol,useMemoCache:Ol,useCacheRefresh:Ol};Pa.useEffectEvent=Ol;var E0={readContext:Zl,use:en,useCallback:function(l,t){return Wl().memoizedState=[l,t===void 0?null:t],l},useContext:Zl,useEffect:u0,useImperativeHandle:function(l,t,e){e=e!=null?e.concat([l]):null,un(4194308,4,f0.bind(null,t,l),e)},useLayoutEffect:function(l,t){return un(4194308,4,l,t)},useInsertionEffect:function(l,t){un(4,2,l,t)},useMemo:function(l,t){var e=Wl();t=t===void 0?null:t;var a=l();if(Ye){It(!0);try{l()}finally{It(!1)}}return e.memoizedState=[a,t],a},useReducer:function(l,t,e){var a=Wl();if(e!==void 0){var u=e(t);if(Ye){It(!0);try{e(t)}finally{It(!1)}}}else u=t;return a.memoizedState=a.baseState=u,l={pending:null,lanes:0,dispatch:null,lastRenderedReducer:l,lastRenderedState:u},a.queue=l,l=l.dispatch=ty.bind(null,J,l),[a.memoizedState,l]},useRef:function(l){var t=Wl();return l={current:l},t.memoizedState=l},useState:function(l){l=Ic(l);var t=l.queue,e=S0.bind(null,J,t);return t.dispatch=e,[l.memoizedState,e]},useDebugValue:ti,useDeferredValue:function(l,t){var e=Wl();return ei(e,l,t)},useTransition:function(){var l=Ic(!1);return l=y0.bind(null,J,l.queue,!0,!1),Wl().memoizedState=l,[!1,l]},useSyncExternalStore:function(l,t,e){var a=J,u=Wl();if(el){if(e===void 0)throw Error(o(407));e=e()}else{if(e=t(),El===null)throw Error(o(349));(ll&127)!==0||Zs(a,t,e)}u.memoizedState=e;var n={value:e,getSnapshot:t};return u.queue=n,u0(Ls.bind(null,a,n,l),[l]),a.flags|=2048,ya(9,{destroy:void 0},Vs.bind(null,a,n,e,t),null),e},useId:function(){var l=Wl(),t=El.identifierPrefix;if(el){var 
e=Nt,a=Dt;e=(a&~(1<<32-it(a)-1)).toString(32)+e,t="_"+t+"R_"+e,e=ln++,0<e&&(t+="H"+e.toString(32)),t+="_"}else e=Wm++,t="_"+t+"r_"+e.toString(32)+"_";return l.memoizedState=t},useHostTransitionStatus:ui,useFormState:Ps,useActionState:Ps,useOptimistic:function(l){var t=Wl();t.memoizedState=t.baseState=l;var e={pending:null,lanes:0,dispatch:null,lastRenderedReducer:null,lastRenderedState:null};return t.queue=e,t=ni.bind(null,J,!0,e),e.dispatch=t,[l,t]},useMemoCache:$c,useCacheRefresh:function(){return Wl().memoizedState=ly.bind(null,J)},useEffectEvent:function(l){var t=Wl(),e={impl:l};return t.memoizedState=e,function(){if((sl&2)!==0)throw Error(o(440));return e.impl.apply(void 0,arguments)}}},ci={readContext:Zl,use:en,useCallback:d0,useContext:Zl,useEffect:li,useImperativeHandle:s0,useInsertionEffect:c0,useLayoutEffect:i0,useMemo:o0,useReducer:an,useRef:a0,useState:function(){return an(Zt)},useDebugValue:ti,useDeferredValue:function(l,t){var e=Nl();return m0(e,gl.memoizedState,l,t)},useTransition:function(){var l=an(Zt)[0],t=Nl().memoizedState;return[typeof l=="boolean"?l:Fa(l),t]},useSyncExternalStore:Xs,useId:v0,useHostTransitionStatus:ui,useFormState:l0,useActionState:l0,useOptimistic:function(l,t){var e=Nl();return ws(e,gl,l,t)},useMemoCache:$c,useCacheRefresh:g0};ci.useEffectEvent=n0;var T0={readContext:Zl,use:en,useCallback:d0,useContext:Zl,useEffect:li,useImperativeHandle:s0,useInsertionEffect:c0,useLayoutEffect:i0,useMemo:o0,useReducer:Fc,useRef:a0,useState:function(){return Fc(Zt)},useDebugValue:ti,useDeferredValue:function(l,t){var e=Nl();return gl===null?ei(e,l,t):m0(e,gl.memoizedState,l,t)},useTransition:function(){var l=Fc(Zt)[0],t=Nl().memoizedState;return[typeof l=="boolean"?l:Fa(l),t]},useSyncExternalStore:Xs,useId:v0,useHostTransitionStatus:ui,useFormState:e0,useActionState:e0,useOptimistic:function(l,t){var e=Nl();return gl!==null?ws(e,gl,l,t):(e.baseState=l,[l,e.queue.dispatch])},useMemoCache:$c,useCacheRefresh:g0};T0.useEffectEvent=n0;function 
ii(l,t,e,a){t=l.memoizedState,e=e(a,t),e=e==null?t:U({},t,e),l.memoizedState=e,l.lanes===0&&(l.updateQueue.baseState=e)}var fi={enqueueSetState:function(l,t,e){l=l._reactInternals;var a=rt(),u=ce(a);u.payload=t,e!=null&&(u.callback=e),t=ie(l,u,a),t!==null&&(at(t,l,a),wa(t,l,a))},enqueueReplaceState:function(l,t,e){l=l._reactInternals;var a=rt(),u=ce(a);u.tag=1,u.payload=t,e!=null&&(u.callback=e),t=ie(l,u,a),t!==null&&(at(t,l,a),wa(t,l,a))},enqueueForceUpdate:function(l,t){l=l._reactInternals;var e=rt(),a=ce(e);a.tag=2,t!=null&&(a.callback=t),t=ie(l,a,e),t!==null&&(at(t,l,e),wa(t,l,e))}};function A0(l,t,e,a,u,n,c){return l=l.stateNode,typeof l.shouldComponentUpdate=="function"?l.shouldComponentUpdate(a,n,c):t.prototype&&t.prototype.isPureReactComponent?!Ga(e,a)||!Ga(u,n):!0}function p0(l,t,e,a){l=t.state,typeof t.componentWillReceiveProps=="function"&&t.componentWillReceiveProps(e,a),typeof t.UNSAFE_componentWillReceiveProps=="function"&&t.UNSAFE_componentWillReceiveProps(e,a),t.state!==l&&fi.enqueueReplaceState(t,t.state,null)}function Ge(l,t){var e=t;if("ref"in t){e={};for(var a in t)a!=="ref"&&(e[a]=t[a])}if(l=l.defaultProps){e===t&&(e=U({},e));for(var u in l)e[u]===void 0&&(e[u]=l[u])}return e}function _0(l){Yu(l)}function O0(l){console.error(l)}function M0(l){Yu(l)}function fn(l,t){try{var e=l.onUncaughtError;e(t.value,{componentStack:t.stack})}catch(a){setTimeout(function(){throw a})}}function x0(l,t,e){try{var a=l.onCaughtError;a(e.value,{componentStack:e.stack,errorBoundary:t.tag===1?t.stateNode:null})}catch(u){setTimeout(function(){throw u})}}function si(l,t,e){return e=ce(e),e.tag=3,e.payload={element:null},e.callback=function(){fn(l,t)},e}function D0(l){return l=ce(l),l.tag=3,l}function N0(l,t,e,a){var u=e.type.getDerivedStateFromError;if(typeof u=="function"){var n=a.value;l.payload=function(){return u(n)},l.callback=function(){x0(t,e,a)}}var c=e.stateNode;c!==null&&typeof c.componentDidCatch=="function"&&(l.callback=function(){x0(t,e,a),typeof 
u!="function"&&(ye===null?ye=new Set([this]):ye.add(this));var i=a.stack;this.componentDidCatch(a.value,{componentStack:i!==null?i:""})})}function ey(l,t,e,a,u){if(e.flags|=32768,a!==null&&typeof a=="object"&&typeof a.then=="function"){if(t=e.alternate,t!==null&&na(t,e,u,!0),e=dt.current,e!==null){switch(e.tag){case 31:case 13:return Tt===null?zn():e.alternate===null&&Ml===0&&(Ml=3),e.flags&=-257,e.flags|=65536,e.lanes=u,a===Wu?e.flags|=16384:(t=e.updateQueue,t===null?e.updateQueue=new Set([a]):t.add(a),ji(l,a,u)),!1;case 22:return e.flags|=65536,a===Wu?e.flags|=16384:(t=e.updateQueue,t===null?(t={transitions:null,markerInstances:null,retryQueue:new Set([a])},e.updateQueue=t):(e=t.retryQueue,e===null?t.retryQueue=new Set([a]):e.add(a)),ji(l,a,u)),!1}throw Error(o(435,e.tag))}return ji(l,a,u),zn(),!1}if(el)return t=dt.current,t!==null?((t.flags&65536)===0&&(t.flags|=256),t.flags|=65536,t.lanes=u,a!==xc&&(l=Error(o(422),{cause:a}),Za(St(l,e)))):(a!==xc&&(t=Error(o(423),{cause:a}),Za(St(t,e))),l=l.current.alternate,l.flags|=65536,u&=-u,l.lanes|=u,a=St(a,e),u=si(l.stateNode,a,u),Gc(l,u),Ml!==4&&(Ml=2)),!1;var n=Error(o(520),{cause:a});if(n=St(n,e),iu===null?iu=[n]:iu.push(n),Ml!==4&&(Ml=2),t===null)return!0;a=St(a,e),e=t;do{switch(e.tag){case 3:return e.flags|=65536,l=u&-u,e.lanes|=l,l=si(e.stateNode,a,l),Gc(e,l),!1;case 1:if(t=e.type,n=e.stateNode,(e.flags&128)===0&&(typeof t.getDerivedStateFromError=="function"||n!==null&&typeof n.componentDidCatch=="function"&&(ye===null||!ye.has(n))))return e.flags|=65536,u&=-u,e.lanes|=u,u=D0(u),N0(u,l,e,a),Gc(e,u),!1}e=e.return}while(e!==null);return!1}var di=Error(o(461)),Rl=!1;function Vl(l,t,e,a){t.child=l===null?js(t,null,e,a):qe(t,l.child,e,a)}function U0(l,t,e,a,u){e=e.render;var n=t.ref;if("ref"in a){var c={};for(var i in a)i!=="ref"&&(c[i]=a[i])}else c=a;return Re(t),a=Kc(l,t,e,c,n,u),i=Jc(),l!==null&&!Rl?(wc(l,t,u),Vt(l,t,u)):(el&&i&&Oc(t),t.flags|=1,Vl(l,t,a,u),t.child)}function C0(l,t,e,a,u){if(l===null){var 
n=e.type;return typeof n=="function"&&!Ac(n)&&n.defaultProps===void 0&&e.compare===null?(t.tag=15,t.type=n,R0(l,t,n,a,u)):(l=Zu(e.type,null,a,t,t.mode,u),l.ref=t.ref,l.return=t,t.child=l)}if(n=l.child,!Si(l,u)){var c=n.memoizedProps;if(e=e.compare,e=e!==null?e:Ga,e(c,a)&&l.ref===t.ref)return Vt(l,t,u)}return t.flags|=1,l=qt(n,a),l.ref=t.ref,l.return=t,t.child=l}function R0(l,t,e,a,u){if(l!==null){var n=l.memoizedProps;if(Ga(n,a)&&l.ref===t.ref)if(Rl=!1,t.pendingProps=a=n,Si(l,u))(l.flags&131072)!==0&&(Rl=!0);else return t.lanes=l.lanes,Vt(l,t,u)}return oi(l,t,e,a,u)}function j0(l,t,e,a){var u=a.children,n=l!==null?l.memoizedState:null;if(l===null&&t.stateNode===null&&(t.stateNode={_visibility:1,_pendingMarkers:null,_retryCache:null,_transitions:null}),a.mode==="hidden"){if((t.flags&128)!==0){if(n=n!==null?n.baseLanes|e:e,l!==null){for(a=t.child=l.child,u=0;a!==null;)u=u|a.lanes|a.childLanes,a=a.sibling;a=u&~n}else a=0,t.child=null;return H0(l,t,n,e,a)}if((e&536870912)!==0)t.memoizedState={baseLanes:0,cachePool:null},l!==null&&Ju(t,n!==null?n.cachePool:null),n!==null?qs(t,n):Xc(),Ys(t);else return a=t.lanes=536870912,H0(l,t,n!==null?n.baseLanes|e:e,e,a)}else n!==null?(Ju(t,n.cachePool),qs(t,n),se(),t.memoizedState=null):(l!==null&&Ju(t,null),Xc(),se());return Vl(l,t,u,e),t.child}function lu(l,t){return l!==null&&l.tag===22||t.stateNode!==null||(t.stateNode={_visibility:1,_pendingMarkers:null,_retryCache:null,_transitions:null}),t.sibling}function H0(l,t,e,a,u){var n=Hc();return n=n===null?null:{parent:Ul._currentValue,pool:n},t.memoizedState={baseLanes:e,cachePool:n},l!==null&&Ju(t,null),Xc(),Ys(t),l!==null&&na(l,t,a,!0),t.childLanes=u,null}function sn(l,t){return t=on({mode:t.mode,children:t.children},l.mode),t.ref=l.ref,l.child=t,t.return=l,t}function B0(l,t,e){return qe(t,l.child,null,e),l=sn(t,t.pendingProps),l.flags|=2,ot(t),t.memoizedState=null,l}function ay(l,t,e){var 
a=t.pendingProps,u=(t.flags&128)!==0;if(t.flags&=-129,l===null){if(el){if(a.mode==="hidden")return l=sn(t,a),t.lanes=536870912,lu(null,l);if(Vc(t),(l=Al)?(l=Wd(l,Et),l=l!==null&&l.data==="&"?l:null,l!==null&&(t.memoizedState={dehydrated:l,treeContext:te!==null?{id:Dt,overflow:Nt}:null,retryLane:536870912,hydrationErrors:null},e=bs(l),e.return=t,t.child=e,Xl=t,Al=null)):l=null,l===null)throw ae(t);return t.lanes=536870912,null}return sn(t,a)}var n=l.memoizedState;if(n!==null){var c=n.dehydrated;if(Vc(t),u)if(t.flags&256)t.flags&=-257,t=B0(l,t,e);else if(t.memoizedState!==null)t.child=l.child,t.flags|=128,t=null;else throw Error(o(558));else if(Rl||na(l,t,e,!1),u=(e&l.childLanes)!==0,Rl||u){if(a=El,a!==null&&(c=Of(a,e),c!==0&&c!==n.retryLane))throw n.retryLane=c,De(l,c),at(a,l,c),di;zn(),t=B0(l,t,e)}else l=n.treeContext,Al=At(c.nextSibling),Xl=t,el=!0,ee=null,Et=!1,l!==null&&Ts(t,l),t=sn(t,a),t.flags|=4096;return t}return l=qt(l.child,{mode:a.mode,children:a.children}),l.ref=t.ref,t.child=l,l.return=t,l}function dn(l,t){var e=t.ref;if(e===null)l!==null&&l.ref!==null&&(t.flags|=4194816);else{if(typeof e!="function"&&typeof e!="object")throw Error(o(284));(l===null||l.ref!==e)&&(t.flags|=4194816)}}function oi(l,t,e,a,u){return Re(t),e=Kc(l,t,e,a,void 0,u),a=Jc(),l!==null&&!Rl?(wc(l,t,u),Vt(l,t,u)):(el&&a&&Oc(t),t.flags|=1,Vl(l,t,e,u),t.child)}function q0(l,t,e,a,u,n){return Re(t),t.updateQueue=null,e=Qs(t,a,e,u),Gs(l),a=Jc(),l!==null&&!Rl?(wc(l,t,n),Vt(l,t,n)):(el&&a&&Oc(t),t.flags|=1,Vl(l,t,e,n),t.child)}function Y0(l,t,e,a,u){if(Re(t),t.stateNode===null){var n=ta,c=e.contextType;typeof c=="object"&&c!==null&&(n=Zl(c)),n=new e(a,n),t.memoizedState=n.state!==null&&n.state!==void 0?n.state:null,n.updater=fi,t.stateNode=n,n._reactInternals=t,n=t.stateNode,n.props=a,n.state=t.memoizedState,n.refs={},qc(t),c=e.contextType,n.context=typeof c=="object"&&c!==null?Zl(c):ta,n.state=t.memoizedState,c=e.getDerivedStateFromProps,typeof 
c=="function"&&(ii(t,e,c,a),n.state=t.memoizedState),typeof e.getDerivedStateFromProps=="function"||typeof n.getSnapshotBeforeUpdate=="function"||typeof n.UNSAFE_componentWillMount!="function"&&typeof n.componentWillMount!="function"||(c=n.state,typeof n.componentWillMount=="function"&&n.componentWillMount(),typeof n.UNSAFE_componentWillMount=="function"&&n.UNSAFE_componentWillMount(),c!==n.state&&fi.enqueueReplaceState(n,n.state,null),$a(t,a,n,u),Wa(),n.state=t.memoizedState),typeof n.componentDidMount=="function"&&(t.flags|=4194308),a=!0}else if(l===null){n=t.stateNode;var i=t.memoizedProps,f=Ge(e,i);n.props=f;var r=n.context,b=e.contextType;c=ta,typeof b=="object"&&b!==null&&(c=Zl(b));var A=e.getDerivedStateFromProps;b=typeof A=="function"||typeof n.getSnapshotBeforeUpdate=="function",i=t.pendingProps!==i,b||typeof n.UNSAFE_componentWillReceiveProps!="function"&&typeof n.componentWillReceiveProps!="function"||(i||r!==c)&&p0(t,n,a,c),ne=!1;var v=t.memoizedState;n.state=v,$a(t,a,n,u),Wa(),r=t.memoizedState,i||v!==r||ne?(typeof A=="function"&&(ii(t,e,A,a),r=t.memoizedState),(f=ne||A0(t,e,f,a,v,r,c))?(b||typeof n.UNSAFE_componentWillMount!="function"&&typeof n.componentWillMount!="function"||(typeof n.componentWillMount=="function"&&n.componentWillMount(),typeof n.UNSAFE_componentWillMount=="function"&&n.UNSAFE_componentWillMount()),typeof n.componentDidMount=="function"&&(t.flags|=4194308)):(typeof n.componentDidMount=="function"&&(t.flags|=4194308),t.memoizedProps=a,t.memoizedState=r),n.props=a,n.state=r,n.context=c,a=f):(typeof n.componentDidMount=="function"&&(t.flags|=4194308),a=!1)}else{n=t.stateNode,Yc(l,t),c=t.memoizedProps,b=Ge(e,c),n.props=b,A=t.pendingProps,v=n.context,r=e.contextType,f=ta,typeof r=="object"&&r!==null&&(f=Zl(r)),i=e.getDerivedStateFromProps,(r=typeof i=="function"||typeof n.getSnapshotBeforeUpdate=="function")||typeof n.UNSAFE_componentWillReceiveProps!="function"&&typeof 
n.componentWillReceiveProps!="function"||(c!==A||v!==f)&&p0(t,n,a,f),ne=!1,v=t.memoizedState,n.state=v,$a(t,a,n,u),Wa();var S=t.memoizedState;c!==A||v!==S||ne||l!==null&&l.dependencies!==null&&Lu(l.dependencies)?(typeof i=="function"&&(ii(t,e,i,a),S=t.memoizedState),(b=ne||A0(t,e,b,a,v,S,f)||l!==null&&l.dependencies!==null&&Lu(l.dependencies))?(r||typeof n.UNSAFE_componentWillUpdate!="function"&&typeof n.componentWillUpdate!="function"||(typeof n.componentWillUpdate=="function"&&n.componentWillUpdate(a,S,f),typeof n.UNSAFE_componentWillUpdate=="function"&&n.UNSAFE_componentWillUpdate(a,S,f)),typeof n.componentDidUpdate=="function"&&(t.flags|=4),typeof n.getSnapshotBeforeUpdate=="function"&&(t.flags|=1024)):(typeof n.componentDidUpdate!="function"||c===l.memoizedProps&&v===l.memoizedState||(t.flags|=4),typeof n.getSnapshotBeforeUpdate!="function"||c===l.memoizedProps&&v===l.memoizedState||(t.flags|=1024),t.memoizedProps=a,t.memoizedState=S),n.props=a,n.state=S,n.context=f,a=b):(typeof n.componentDidUpdate!="function"||c===l.memoizedProps&&v===l.memoizedState||(t.flags|=4),typeof n.getSnapshotBeforeUpdate!="function"||c===l.memoizedProps&&v===l.memoizedState||(t.flags|=1024),a=!1)}return n=a,dn(l,t),a=(t.flags&128)!==0,n||a?(n=t.stateNode,e=a&&typeof e.getDerivedStateFromError!="function"?null:n.render(),t.flags|=1,l!==null&&a?(t.child=qe(t,l.child,null,u),t.child=qe(t,null,e,u)):Vl(l,t,e,u),t.memoizedState=n.state,l=t.child):l=Vt(l,t,u),l}function G0(l,t,e,a){return Ue(),t.flags|=256,Vl(l,t,e,a),t.child}var mi={dehydrated:null,treeContext:null,retryLane:0,hydrationErrors:null};function yi(l){return{baseLanes:l,cachePool:xs()}}function ri(l,t,e){return l=l!==null?l.childLanes&~e:0,t&&(l|=yt),l}function Q0(l,t,e){var 
a=t.pendingProps,u=!1,n=(t.flags&128)!==0,c;if((c=n)||(c=l!==null&&l.memoizedState===null?!1:(Dl.current&2)!==0),c&&(u=!0,t.flags&=-129),c=(t.flags&32)!==0,t.flags&=-33,l===null){if(el){if(u?fe(t):se(),(l=Al)?(l=Wd(l,Et),l=l!==null&&l.data!=="&"?l:null,l!==null&&(t.memoizedState={dehydrated:l,treeContext:te!==null?{id:Dt,overflow:Nt}:null,retryLane:536870912,hydrationErrors:null},e=bs(l),e.return=t,t.child=e,Xl=t,Al=null)):l=null,l===null)throw ae(t);return ki(l)?t.lanes=32:t.lanes=536870912,null}var i=a.children;return a=a.fallback,u?(se(),u=t.mode,i=on({mode:"hidden",children:i},u),a=Ne(a,u,e,null),i.return=t,a.return=t,i.sibling=a,t.child=i,a=t.child,a.memoizedState=yi(e),a.childLanes=ri(l,c,e),t.memoizedState=mi,lu(null,a)):(fe(t),hi(t,i))}var f=l.memoizedState;if(f!==null&&(i=f.dehydrated,i!==null)){if(n)t.flags&256?(fe(t),t.flags&=-257,t=vi(l,t,e)):t.memoizedState!==null?(se(),t.child=l.child,t.flags|=128,t=null):(se(),i=a.fallback,u=t.mode,a=on({mode:"visible",children:a.children},u),i=Ne(i,u,e,null),i.flags|=2,a.return=t,i.return=t,a.sibling=i,t.child=a,qe(t,l.child,null,e),a=t.child,a.memoizedState=yi(e),a.childLanes=ri(l,c,e),t.memoizedState=mi,t=lu(null,a));else if(fe(t),ki(i)){if(c=i.nextSibling&&i.nextSibling.dataset,c)var r=c.dgst;c=r,a=Error(o(419)),a.stack="",a.digest=c,Za({value:a,source:null,stack:null}),t=vi(l,t,e)}else if(Rl||na(l,t,e,!1),c=(e&l.childLanes)!==0,Rl||c){if(c=El,c!==null&&(a=Of(c,e),a!==0&&a!==f.retryLane))throw f.retryLane=a,De(l,a),at(c,l,a),di;$i(i)||zn(),t=vi(l,t,e)}else $i(i)?(t.flags|=192,t.child=l.child,t=null):(l=f.treeContext,Al=At(i.nextSibling),Xl=t,el=!0,ee=null,Et=!1,l!==null&&Ts(t,l),t=hi(t,a.children),t.flags|=4096);return t}return 
u?(se(),i=a.fallback,u=t.mode,f=l.child,r=f.sibling,a=qt(f,{mode:"hidden",children:a.children}),a.subtreeFlags=f.subtreeFlags&65011712,r!==null?i=qt(r,i):(i=Ne(i,u,e,null),i.flags|=2),i.return=t,a.return=t,a.sibling=i,t.child=a,lu(null,a),a=t.child,i=l.child.memoizedState,i===null?i=yi(e):(u=i.cachePool,u!==null?(f=Ul._currentValue,u=u.parent!==f?{parent:f,pool:f}:u):u=xs(),i={baseLanes:i.baseLanes|e,cachePool:u}),a.memoizedState=i,a.childLanes=ri(l,c,e),t.memoizedState=mi,lu(l.child,a)):(fe(t),e=l.child,l=e.sibling,e=qt(e,{mode:"visible",children:a.children}),e.return=t,e.sibling=null,l!==null&&(c=t.deletions,c===null?(t.deletions=[l],t.flags|=16):c.push(l)),t.child=e,t.memoizedState=null,e)}function hi(l,t){return t=on({mode:"visible",children:t},l.mode),t.return=l,l.child=t}function on(l,t){return l=st(22,l,null,t),l.lanes=0,l}function vi(l,t,e){return qe(t,l.child,null,e),l=hi(t,t.pendingProps.children),l.flags|=2,t.memoizedState=null,l}function X0(l,t,e){l.lanes|=t;var a=l.alternate;a!==null&&(a.lanes|=t),Uc(l.return,t,e)}function gi(l,t,e,a,u,n){var c=l.memoizedState;c===null?l.memoizedState={isBackwards:t,rendering:null,renderingStartTime:0,last:a,tail:e,tailMode:u,treeForkCount:n}:(c.isBackwards=t,c.rendering=null,c.renderingStartTime=0,c.last=a,c.tail=e,c.tailMode=u,c.treeForkCount=n)}function Z0(l,t,e){var a=t.pendingProps,u=a.revealOrder,n=a.tail;a=a.children;var c=Dl.current,i=(c&2)!==0;if(i?(c=c&1|2,t.flags|=128):c&=1,O(Dl,c),Vl(l,t,a,e),a=el?Xa:0,!i&&l!==null&&(l.flags&128)!==0)l:for(l=t.child;l!==null;){if(l.tag===13)l.memoizedState!==null&&X0(l,e,t);else if(l.tag===19)X0(l,e,t);else if(l.child!==null){l.child.return=l,l=l.child;continue}if(l===t)break l;for(;l.sibling===null;){if(l.return===null||l.return===t)break 
l;l=l.return}l.sibling.return=l.return,l=l.sibling}switch(u){case"forwards":for(e=t.child,u=null;e!==null;)l=e.alternate,l!==null&&Iu(l)===null&&(u=e),e=e.sibling;e=u,e===null?(u=t.child,t.child=null):(u=e.sibling,e.sibling=null),gi(t,!1,u,e,n,a);break;case"backwards":case"unstable_legacy-backwards":for(e=null,u=t.child,t.child=null;u!==null;){if(l=u.alternate,l!==null&&Iu(l)===null){t.child=u;break}l=u.sibling,u.sibling=e,e=u,u=l}gi(t,!0,e,null,n,a);break;case"together":gi(t,!1,null,null,void 0,a);break;default:t.memoizedState=null}return t.child}function Vt(l,t,e){if(l!==null&&(t.dependencies=l.dependencies),me|=t.lanes,(e&t.childLanes)===0)if(l!==null){if(na(l,t,e,!1),(e&t.childLanes)===0)return null}else return null;if(l!==null&&t.child!==l.child)throw Error(o(153));if(t.child!==null){for(l=t.child,e=qt(l,l.pendingProps),t.child=e,e.return=t;l.sibling!==null;)l=l.sibling,e=e.sibling=qt(l,l.pendingProps),e.return=t;e.sibling=null}return t.child}function Si(l,t){return(l.lanes&t)!==0?!0:(l=l.dependencies,!!(l!==null&&Lu(l)))}function uy(l,t,e){switch(t.tag){case 3:wl(t,t.stateNode.containerInfo),ue(t,Ul,l.memoizedState.cache),Ue();break;case 27:case 5:Oa(t);break;case 4:wl(t,t.stateNode.containerInfo);break;case 10:ue(t,t.type,t.memoizedProps.value);break;case 31:if(t.memoizedState!==null)return t.flags|=128,Vc(t),null;break;case 13:var a=t.memoizedState;if(a!==null)return a.dehydrated!==null?(fe(t),t.flags|=128,null):(e&t.child.childLanes)!==0?Q0(l,t,e):(fe(t),l=Vt(l,t,e),l!==null?l.sibling:null);fe(t);break;case 19:var u=(l.flags&128)!==0;if(a=(e&t.childLanes)!==0,a||(na(l,t,e,!1),a=(e&t.childLanes)!==0),u){if(a)return Z0(l,t,e);t.flags|=128}if(u=t.memoizedState,u!==null&&(u.rendering=null,u.tail=null,u.lastEffect=null),O(Dl,Dl.current),a)break;return null;case 22:return t.lanes=0,j0(l,t,e,t.pendingProps);case 24:ue(t,Ul,l.memoizedState.cache)}return Vt(l,t,e)}function 
V0(l,t,e){if(l!==null)if(l.memoizedProps!==t.pendingProps)Rl=!0;else{if(!Si(l,e)&&(t.flags&128)===0)return Rl=!1,uy(l,t,e);Rl=(l.flags&131072)!==0}else Rl=!1,el&&(t.flags&1048576)!==0&&Es(t,Xa,t.index);switch(t.lanes=0,t.tag){case 16:l:{var a=t.pendingProps;if(l=He(t.elementType),t.type=l,typeof l=="function")Ac(l)?(a=Ge(l,a),t.tag=1,t=Y0(null,t,l,a,e)):(t.tag=0,t=oi(null,t,l,a,e));else{if(l!=null){var u=l.$$typeof;if(u===al){t.tag=11,t=U0(null,t,l,a,e);break l}else if(u===W){t.tag=14,t=C0(null,t,l,a,e);break l}}throw t=_t(l)||l,Error(o(306,t,""))}}return t;case 0:return oi(l,t,t.type,t.pendingProps,e);case 1:return a=t.type,u=Ge(a,t.pendingProps),Y0(l,t,a,u,e);case 3:l:{if(wl(t,t.stateNode.containerInfo),l===null)throw Error(o(387));a=t.pendingProps;var n=t.memoizedState;u=n.element,Yc(l,t),$a(t,a,null,e);var c=t.memoizedState;if(a=c.cache,ue(t,Ul,a),a!==n.cache&&Cc(t,[Ul],e,!0),Wa(),a=c.element,n.isDehydrated)if(n={element:a,isDehydrated:!1,cache:c.cache},t.updateQueue.baseState=n,t.memoizedState=n,t.flags&256){t=G0(l,t,a,e);break l}else if(a!==u){u=St(Error(o(424)),t),Za(u),t=G0(l,t,a,e);break l}else for(l=t.stateNode.containerInfo,l.nodeType===9?l=l.body:l=l.nodeName==="HTML"?l.ownerDocument.body:l,Al=At(l.firstChild),Xl=t,el=!0,ee=null,Et=!0,e=js(t,null,a,e),t.child=e;e;)e.flags=e.flags&-3|4096,e=e.sibling;else{if(Ue(),a===u){t=Vt(l,t,e);break l}Vl(l,t,a,e)}t=t.child}return t;case 26:return dn(l,t),l===null?(e=lo(t.type,null,t.pendingProps,null))?t.memoizedState=e:el||(e=t.type,l=t.pendingProps,a=Mn(K.current).createElement(e),a[Ql]=t,a[Fl]=l,Ll(a,e,l),Yl(a),t.stateNode=a):t.memoizedState=lo(t.type,l.memoizedProps,t.pendingProps,l.memoizedState),null;case 27:return Oa(t),l===null&&el&&(a=t.stateNode=Fd(t.type,t.pendingProps,K.current),Xl=t,Et=!0,u=Al,ge(t.type)?(Fi=u,Al=At(a.firstChild)):Al=u),Vl(l,t,t.pendingProps.children,e),dn(l,t),l===null&&(t.flags|=4194304),t.child;case 5:return 
l===null&&el&&((u=a=Al)&&(a=jy(a,t.type,t.pendingProps,Et),a!==null?(t.stateNode=a,Xl=t,Al=At(a.firstChild),Et=!1,u=!0):u=!1),u||ae(t)),Oa(t),u=t.type,n=t.pendingProps,c=l!==null?l.memoizedProps:null,a=n.children,Ji(u,n)?a=null:c!==null&&Ji(u,c)&&(t.flags|=32),t.memoizedState!==null&&(u=Kc(l,t,$m,null,null,e),hu._currentValue=u),dn(l,t),Vl(l,t,a,e),t.child;case 6:return l===null&&el&&((l=e=Al)&&(e=Hy(e,t.pendingProps,Et),e!==null?(t.stateNode=e,Xl=t,Al=null,l=!0):l=!1),l||ae(t)),null;case 13:return Q0(l,t,e);case 4:return wl(t,t.stateNode.containerInfo),a=t.pendingProps,l===null?t.child=qe(t,null,a,e):Vl(l,t,a,e),t.child;case 11:return U0(l,t,t.type,t.pendingProps,e);case 7:return Vl(l,t,t.pendingProps,e),t.child;case 8:return Vl(l,t,t.pendingProps.children,e),t.child;case 12:return Vl(l,t,t.pendingProps.children,e),t.child;case 10:return a=t.pendingProps,ue(t,t.type,a.value),Vl(l,t,a.children,e),t.child;case 9:return u=t.type._context,a=t.pendingProps.children,Re(t),u=Zl(u),a=a(u),t.flags|=1,Vl(l,t,a,e),t.child;case 14:return C0(l,t,t.type,t.pendingProps,e);case 15:return R0(l,t,t.type,t.pendingProps,e);case 19:return Z0(l,t,e);case 31:return ay(l,t,e);case 22:return j0(l,t,e,t.pendingProps);case 24:return Re(t),a=Zl(Ul),l===null?(u=Hc(),u===null&&(u=El,n=Rc(),u.pooledCache=n,n.refCount++,n!==null&&(u.pooledCacheLanes|=e),u=n),t.memoizedState={parent:a,cache:u},qc(t),ue(t,Ul,u)):((l.lanes&e)!==0&&(Yc(l,t),$a(t,null,null,e),Wa()),u=l.memoizedState,n=t.memoizedState,u.parent!==a?(u={parent:a,cache:a},t.memoizedState=u,t.lanes===0&&(t.memoizedState=t.updateQueue.baseState=u),ue(t,Ul,a)):(a=n.cache,ue(t,Ul,a),a!==u.cache&&Cc(t,[Ul],e,!0))),Vl(l,t,t.pendingProps.children,e),t.child;case 29:throw t.pendingProps}throw Error(o(156,t.tag))}function Lt(l){l.flags|=4}function bi(l,t,e,a,u){if((t=(l.mode&32)!==0)&&(t=!1),t){if(l.flags|=16777216,(u&335544128)===u)if(l.stateNode.complete)l.flags|=8192;else if(vd())l.flags|=8192;else throw Be=Wu,Bc}else 
l.flags&=-16777217}function L0(l,t){if(t.type!=="stylesheet"||(t.state.loading&4)!==0)l.flags&=-16777217;else if(l.flags|=16777216,!no(t))if(vd())l.flags|=8192;else throw Be=Wu,Bc}function mn(l,t){t!==null&&(l.flags|=4),l.flags&16384&&(t=l.tag!==22?Af():536870912,l.lanes|=t,ga|=t)}function tu(l,t){if(!el)switch(l.tailMode){case"hidden":t=l.tail;for(var e=null;t!==null;)t.alternate!==null&&(e=t),t=t.sibling;e===null?l.tail=null:e.sibling=null;break;case"collapsed":e=l.tail;for(var a=null;e!==null;)e.alternate!==null&&(a=e),e=e.sibling;a===null?t||l.tail===null?l.tail=null:l.tail.sibling=null:a.sibling=null}}function pl(l){var t=l.alternate!==null&&l.alternate.child===l.child,e=0,a=0;if(t)for(var u=l.child;u!==null;)e|=u.lanes|u.childLanes,a|=u.subtreeFlags&65011712,a|=u.flags&65011712,u.return=l,u=u.sibling;else for(u=l.child;u!==null;)e|=u.lanes|u.childLanes,a|=u.subtreeFlags,a|=u.flags,u.return=l,u=u.sibling;return l.subtreeFlags|=a,l.childLanes=e,t}function ny(l,t,e){var a=t.pendingProps;switch(Mc(t),t.tag){case 16:case 15:case 0:case 11:case 7:case 8:case 12:case 9:case 14:return pl(t),null;case 1:return pl(t),null;case 3:return e=t.stateNode,a=null,l!==null&&(a=l.memoizedState.cache),t.memoizedState.cache!==a&&(t.flags|=2048),Qt(Ul),xl(),e.pendingContext&&(e.context=e.pendingContext,e.pendingContext=null),(l===null||l.child===null)&&(ua(t)?Lt(t):l===null||l.memoizedState.isDehydrated&&(t.flags&256)===0||(t.flags|=1024,Dc())),pl(t),null;case 26:var u=t.type,n=t.memoizedState;return l===null?(Lt(t),n!==null?(pl(t),L0(t,n)):(pl(t),bi(t,u,null,a,e))):n?n!==l.memoizedState?(Lt(t),pl(t),L0(t,n)):(pl(t),t.flags&=-16777217):(l=l.memoizedProps,l!==a&&Lt(t),pl(t),bi(t,u,l,a,e)),null;case 27:if(Tu(t),e=K.current,u=t.type,l!==null&&t.stateNode!=null)l.memoizedProps!==a&&Lt(t);else{if(!a){if(t.stateNode===null)throw Error(o(166));return pl(t),null}l=N.current,ua(t)?As(t):(l=Fd(u,a,e),t.stateNode=l,Lt(t))}return pl(t),null;case 
5:if(Tu(t),u=t.type,l!==null&&t.stateNode!=null)l.memoizedProps!==a&&Lt(t);else{if(!a){if(t.stateNode===null)throw Error(o(166));return pl(t),null}if(n=N.current,ua(t))As(t);else{var c=Mn(K.current);switch(n){case 1:n=c.createElementNS("http://www.w3.org/2000/svg",u);break;case 2:n=c.createElementNS("http://www.w3.org/1998/Math/MathML",u);break;default:switch(u){case"svg":n=c.createElementNS("http://www.w3.org/2000/svg",u);break;case"math":n=c.createElementNS("http://www.w3.org/1998/Math/MathML",u);break;case"script":n=c.createElement("div"),n.innerHTML="<script><\/script>",n=n.removeChild(n.firstChild);break;case"select":n=typeof a.is=="string"?c.createElement("select",{is:a.is}):c.createElement("select"),a.multiple?n.multiple=!0:a.size&&(n.size=a.size);break;default:n=typeof a.is=="string"?c.createElement(u,{is:a.is}):c.createElement(u)}}n[Ql]=t,n[Fl]=a;l:for(c=t.child;c!==null;){if(c.tag===5||c.tag===6)n.appendChild(c.stateNode);else if(c.tag!==4&&c.tag!==27&&c.child!==null){c.child.return=c,c=c.child;continue}if(c===t)break l;for(;c.sibling===null;){if(c.return===null||c.return===t)break l;c=c.return}c.sibling.return=c.return,c=c.sibling}t.stateNode=n;l:switch(Ll(n,u,a),u){case"button":case"input":case"select":case"textarea":a=!!a.autoFocus;break l;case"img":a=!0;break l;default:a=!1}a&&Lt(t)}}return pl(t),bi(t,t.type,l===null?null:l.memoizedProps,t.pendingProps,e),null;case 6:if(l&&t.stateNode!=null)l.memoizedProps!==a&&Lt(t);else{if(typeof a!="string"&&t.stateNode===null)throw Error(o(166));if(l=K.current,ua(t)){if(l=t.stateNode,e=t.memoizedProps,a=null,u=Xl,u!==null)switch(u.tag){case 27:case 5:a=u.memoizedProps}l[Ql]=t,l=!!(l.nodeValue===e||a!==null&&a.suppressHydrationWarning===!0||Qd(l.nodeValue,e)),l||ae(t,!0)}else l=Mn(l).createTextNode(a),l[Ql]=t,t.stateNode=l}return pl(t),null;case 31:if(e=t.memoizedState,l===null||l.memoizedState!==null){if(a=ua(t),e!==null){if(l===null){if(!a)throw 
Error(o(318));if(l=t.memoizedState,l=l!==null?l.dehydrated:null,!l)throw Error(o(557));l[Ql]=t}else Ue(),(t.flags&128)===0&&(t.memoizedState=null),t.flags|=4;pl(t),l=!1}else e=Dc(),l!==null&&l.memoizedState!==null&&(l.memoizedState.hydrationErrors=e),l=!0;if(!l)return t.flags&256?(ot(t),t):(ot(t),null);if((t.flags&128)!==0)throw Error(o(558))}return pl(t),null;case 13:if(a=t.memoizedState,l===null||l.memoizedState!==null&&l.memoizedState.dehydrated!==null){if(u=ua(t),a!==null&&a.dehydrated!==null){if(l===null){if(!u)throw Error(o(318));if(u=t.memoizedState,u=u!==null?u.dehydrated:null,!u)throw Error(o(317));u[Ql]=t}else Ue(),(t.flags&128)===0&&(t.memoizedState=null),t.flags|=4;pl(t),u=!1}else u=Dc(),l!==null&&l.memoizedState!==null&&(l.memoizedState.hydrationErrors=u),u=!0;if(!u)return t.flags&256?(ot(t),t):(ot(t),null)}return ot(t),(t.flags&128)!==0?(t.lanes=e,t):(e=a!==null,l=l!==null&&l.memoizedState!==null,e&&(a=t.child,u=null,a.alternate!==null&&a.alternate.memoizedState!==null&&a.alternate.memoizedState.cachePool!==null&&(u=a.alternate.memoizedState.cachePool.pool),n=null,a.memoizedState!==null&&a.memoizedState.cachePool!==null&&(n=a.memoizedState.cachePool.pool),n!==u&&(a.flags|=2048)),e!==l&&e&&(t.child.flags|=8192),mn(t,t.updateQueue),pl(t),null);case 4:return xl(),l===null&&Xi(t.stateNode.containerInfo),pl(t),null;case 10:return Qt(t.type),pl(t),null;case 19:if(g(Dl),a=t.memoizedState,a===null)return pl(t),null;if(u=(t.flags&128)!==0,n=a.rendering,n===null)if(u)tu(a,!1);else{if(Ml!==0||l!==null&&(l.flags&128)!==0)for(l=t.child;l!==null;){if(n=Iu(l),n!==null){for(t.flags|=128,tu(a,!1),l=n.updateQueue,t.updateQueue=l,mn(t,l),t.subtreeFlags=0,l=e,e=t.child;e!==null;)Ss(e,l),e=e.sibling;return 
O(Dl,Dl.current&1|2),el&&Yt(t,a.treeForkCount),t.child}l=l.sibling}a.tail!==null&&nt()>gn&&(t.flags|=128,u=!0,tu(a,!1),t.lanes=4194304)}else{if(!u)if(l=Iu(n),l!==null){if(t.flags|=128,u=!0,l=l.updateQueue,t.updateQueue=l,mn(t,l),tu(a,!0),a.tail===null&&a.tailMode==="hidden"&&!n.alternate&&!el)return pl(t),null}else 2*nt()-a.renderingStartTime>gn&&e!==536870912&&(t.flags|=128,u=!0,tu(a,!1),t.lanes=4194304);a.isBackwards?(n.sibling=t.child,t.child=n):(l=a.last,l!==null?l.sibling=n:t.child=n,a.last=n)}return a.tail!==null?(l=a.tail,a.rendering=l,a.tail=l.sibling,a.renderingStartTime=nt(),l.sibling=null,e=Dl.current,O(Dl,u?e&1|2:e&1),el&&Yt(t,a.treeForkCount),l):(pl(t),null);case 22:case 23:return ot(t),Zc(),a=t.memoizedState!==null,l!==null?l.memoizedState!==null!==a&&(t.flags|=8192):a&&(t.flags|=8192),a?(e&536870912)!==0&&(t.flags&128)===0&&(pl(t),t.subtreeFlags&6&&(t.flags|=8192)):pl(t),e=t.updateQueue,e!==null&&mn(t,e.retryQueue),e=null,l!==null&&l.memoizedState!==null&&l.memoizedState.cachePool!==null&&(e=l.memoizedState.cachePool.pool),a=null,t.memoizedState!==null&&t.memoizedState.cachePool!==null&&(a=t.memoizedState.cachePool.pool),a!==e&&(t.flags|=2048),l!==null&&g(je),null;case 24:return e=null,l!==null&&(e=l.memoizedState.cache),t.memoizedState.cache!==e&&(t.flags|=2048),Qt(Ul),pl(t),null;case 25:return null;case 30:return null}throw Error(o(156,t.tag))}function cy(l,t){switch(Mc(t),t.tag){case 1:return l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 3:return Qt(Ul),xl(),l=t.flags,(l&65536)!==0&&(l&128)===0?(t.flags=l&-65537|128,t):null;case 26:case 27:case 5:return Tu(t),null;case 31:if(t.memoizedState!==null){if(ot(t),t.alternate===null)throw Error(o(340));Ue()}return l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 13:if(ot(t),l=t.memoizedState,l!==null&&l.dehydrated!==null){if(t.alternate===null)throw Error(o(340));Ue()}return l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 19:return g(Dl),null;case 4:return xl(),null;case 10:return 
Qt(t.type),null;case 22:case 23:return ot(t),Zc(),l!==null&&g(je),l=t.flags,l&65536?(t.flags=l&-65537|128,t):null;case 24:return Qt(Ul),null;case 25:return null;default:return null}}function K0(l,t){switch(Mc(t),t.tag){case 3:Qt(Ul),xl();break;case 26:case 27:case 5:Tu(t);break;case 4:xl();break;case 31:t.memoizedState!==null&&ot(t);break;case 13:ot(t);break;case 19:g(Dl);break;case 10:Qt(t.type);break;case 22:case 23:ot(t),Zc(),l!==null&&g(je);break;case 24:Qt(Ul)}}function eu(l,t){try{var e=t.updateQueue,a=e!==null?e.lastEffect:null;if(a!==null){var u=a.next;e=u;do{if((e.tag&l)===l){a=void 0;var n=e.create,c=e.inst;a=n(),c.destroy=a}e=e.next}while(e!==u)}}catch(i){hl(t,t.return,i)}}function de(l,t,e){try{var a=t.updateQueue,u=a!==null?a.lastEffect:null;if(u!==null){var n=u.next;a=n;do{if((a.tag&l)===l){var c=a.inst,i=c.destroy;if(i!==void 0){c.destroy=void 0,u=t;var f=e,r=i;try{r()}catch(b){hl(u,f,b)}}}a=a.next}while(a!==n)}}catch(b){hl(t,t.return,b)}}function J0(l){var t=l.updateQueue;if(t!==null){var e=l.stateNode;try{Bs(t,e)}catch(a){hl(l,l.return,a)}}}function w0(l,t,e){e.props=Ge(l.type,l.memoizedProps),e.state=l.memoizedState;try{e.componentWillUnmount()}catch(a){hl(l,t,a)}}function au(l,t){try{var e=l.ref;if(e!==null){switch(l.tag){case 26:case 27:case 5:var a=l.stateNode;break;case 30:a=l.stateNode;break;default:a=l.stateNode}typeof e=="function"?l.refCleanup=e(a):e.current=a}}catch(u){hl(l,t,u)}}function Ut(l,t){var e=l.ref,a=l.refCleanup;if(e!==null)if(typeof a=="function")try{a()}catch(u){hl(l,t,u)}finally{l.refCleanup=null,l=l.alternate,l!=null&&(l.refCleanup=null)}else if(typeof e=="function")try{e(null)}catch(u){hl(l,t,u)}else e.current=null}function W0(l){var t=l.type,e=l.memoizedProps,a=l.stateNode;try{l:switch(t){case"button":case"input":case"select":case"textarea":e.autoFocus&&a.focus();break l;case"img":e.src?a.src=e.src:e.srcSet&&(a.srcset=e.srcSet)}}catch(u){hl(l,l.return,u)}}function zi(l,t,e){try{var 
a=l.stateNode;xy(a,l.type,e,t),a[Fl]=t}catch(u){hl(l,l.return,u)}}function $0(l){return l.tag===5||l.tag===3||l.tag===26||l.tag===27&&ge(l.type)||l.tag===4}function Ei(l){l:for(;;){for(;l.sibling===null;){if(l.return===null||$0(l.return))return null;l=l.return}for(l.sibling.return=l.return,l=l.sibling;l.tag!==5&&l.tag!==6&&l.tag!==18;){if(l.tag===27&&ge(l.type)||l.flags&2||l.child===null||l.tag===4)continue l;l.child.return=l,l=l.child}if(!(l.flags&2))return l.stateNode}}function Ti(l,t,e){var a=l.tag;if(a===5||a===6)l=l.stateNode,t?(e.nodeType===9?e.body:e.nodeName==="HTML"?e.ownerDocument.body:e).insertBefore(l,t):(t=e.nodeType===9?e.body:e.nodeName==="HTML"?e.ownerDocument.body:e,t.appendChild(l),e=e._reactRootContainer,e!=null||t.onclick!==null||(t.onclick=Ht));else if(a!==4&&(a===27&&ge(l.type)&&(e=l.stateNode,t=null),l=l.child,l!==null))for(Ti(l,t,e),l=l.sibling;l!==null;)Ti(l,t,e),l=l.sibling}function yn(l,t,e){var a=l.tag;if(a===5||a===6)l=l.stateNode,t?e.insertBefore(l,t):e.appendChild(l);else if(a!==4&&(a===27&&ge(l.type)&&(e=l.stateNode),l=l.child,l!==null))for(yn(l,t,e),l=l.sibling;l!==null;)yn(l,t,e),l=l.sibling}function k0(l){var t=l.stateNode,e=l.memoizedProps;try{for(var a=l.type,u=t.attributes;u.length;)t.removeAttributeNode(u[0]);Ll(t,a,e),t[Ql]=l,t[Fl]=e}catch(n){hl(l,l.return,n)}}var Kt=!1,jl=!1,Ai=!1,F0=typeof WeakSet=="function"?WeakSet:Set,Gl=null;function iy(l,t){if(l=l.containerInfo,Li=jn,l=ss(l),vc(l)){if("selectionStart"in l)var e={start:l.selectionStart,end:l.selectionEnd};else l:{e=(e=l.ownerDocument)&&e.defaultView||window;var a=e.getSelection&&e.getSelection();if(a&&a.rangeCount!==0){e=a.anchorNode;var u=a.anchorOffset,n=a.focusNode;a=a.focusOffset;try{e.nodeType,n.nodeType}catch{e=null;break l}var c=0,i=-1,f=-1,r=0,b=0,A=l,v=null;t:for(;;){for(var S;A!==e||u!==0&&A.nodeType!==3||(i=c+u),A!==n||a!==0&&A.nodeType!==3||(f=c+a),A.nodeType===3&&(c+=A.nodeValue.length),(S=A.firstChild)!==null;)v=A,A=S;for(;;){if(A===l)break 
t;if(v===e&&++r===u&&(i=c),v===n&&++b===a&&(f=c),(S=A.nextSibling)!==null)break;A=v,v=A.parentNode}A=S}e=i===-1||f===-1?null:{start:i,end:f}}else e=null}e=e||{start:0,end:0}}else e=null;for(Ki={focusedElem:l,selectionRange:e},jn=!1,Gl=t;Gl!==null;)if(t=Gl,l=t.child,(t.subtreeFlags&1028)!==0&&l!==null)l.return=t,Gl=l;else for(;Gl!==null;){switch(t=Gl,n=t.alternate,l=t.flags,t.tag){case 0:if((l&4)!==0&&(l=t.updateQueue,l=l!==null?l.events:null,l!==null))for(e=0;e<l.length;e++)u=l[e],u.ref.impl=u.nextImpl;break;case 11:case 15:break;case 1:if((l&1024)!==0&&n!==null){l=void 0,e=t,u=n.memoizedProps,n=n.memoizedState,a=e.stateNode;try{var j=Ge(e.type,u);l=a.getSnapshotBeforeUpdate(j,n),a.__reactInternalSnapshotBeforeUpdate=l}catch(Z){hl(e,e.return,Z)}}break;case 3:if((l&1024)!==0){if(l=t.stateNode.containerInfo,e=l.nodeType,e===9)Wi(l);else if(e===1)switch(l.nodeName){case"HEAD":case"HTML":case"BODY":Wi(l);break;default:l.textContent=""}}break;case 5:case 26:case 27:case 6:case 4:case 17:break;default:if((l&1024)!==0)throw Error(o(163))}if(l=t.sibling,l!==null){l.return=t.return,Gl=l;break}Gl=t.return}}function I0(l,t,e){var a=e.flags;switch(e.tag){case 0:case 11:case 15:wt(l,e),a&4&&eu(5,e);break;case 1:if(wt(l,e),a&4)if(l=e.stateNode,t===null)try{l.componentDidMount()}catch(c){hl(e,e.return,c)}else{var u=Ge(e.type,t.memoizedProps);t=t.memoizedState;try{l.componentDidUpdate(u,t,l.__reactInternalSnapshotBeforeUpdate)}catch(c){hl(e,e.return,c)}}a&64&&J0(e),a&512&&au(e,e.return);break;case 3:if(wt(l,e),a&64&&(l=e.updateQueue,l!==null)){if(t=null,e.child!==null)switch(e.child.tag){case 27:case 5:t=e.child.stateNode;break;case 1:t=e.child.stateNode}try{Bs(l,t)}catch(c){hl(e,e.return,c)}}break;case 27:t===null&&a&4&&k0(e);case 26:case 5:wt(l,e),t===null&&a&4&&W0(e),a&512&&au(e,e.return);break;case 12:wt(l,e);break;case 31:wt(l,e),a&4&&td(l,e);break;case 
13:wt(l,e),a&4&&ed(l,e),a&64&&(l=e.memoizedState,l!==null&&(l=l.dehydrated,l!==null&&(e=vy.bind(null,e),By(l,e))));break;case 22:if(a=e.memoizedState!==null||Kt,!a){t=t!==null&&t.memoizedState!==null||jl,u=Kt;var n=jl;Kt=a,(jl=t)&&!n?Wt(l,e,(e.subtreeFlags&8772)!==0):wt(l,e),Kt=u,jl=n}break;case 30:break;default:wt(l,e)}}function P0(l){var t=l.alternate;t!==null&&(l.alternate=null,P0(t)),l.child=null,l.deletions=null,l.sibling=null,l.tag===5&&(t=l.stateNode,t!==null&&Pn(t)),l.stateNode=null,l.return=null,l.dependencies=null,l.memoizedProps=null,l.memoizedState=null,l.pendingProps=null,l.stateNode=null,l.updateQueue=null}var _l=null,Pl=!1;function Jt(l,t,e){for(e=e.child;e!==null;)ld(l,t,e),e=e.sibling}function ld(l,t,e){if(ct&&typeof ct.onCommitFiberUnmount=="function")try{ct.onCommitFiberUnmount(Ma,e)}catch{}switch(e.tag){case 26:jl||Ut(e,t),Jt(l,t,e),e.memoizedState?e.memoizedState.count--:e.stateNode&&(e=e.stateNode,e.parentNode.removeChild(e));break;case 27:jl||Ut(e,t);var a=_l,u=Pl;ge(e.type)&&(_l=e.stateNode,Pl=!1),Jt(l,t,e),mu(e.stateNode),_l=a,Pl=u;break;case 5:jl||Ut(e,t);case 6:if(a=_l,u=Pl,_l=null,Jt(l,t,e),_l=a,Pl=u,_l!==null)if(Pl)try{(_l.nodeType===9?_l.body:_l.nodeName==="HTML"?_l.ownerDocument.body:_l).removeChild(e.stateNode)}catch(n){hl(e,t,n)}else try{_l.removeChild(e.stateNode)}catch(n){hl(e,t,n)}break;case 18:_l!==null&&(Pl?(l=_l,Jd(l.nodeType===9?l.body:l.nodeName==="HTML"?l.ownerDocument.body:l,e.stateNode),_a(l)):Jd(_l,e.stateNode));break;case 4:a=_l,u=Pl,_l=e.stateNode.containerInfo,Pl=!0,Jt(l,t,e),_l=a,Pl=u;break;case 0:case 11:case 14:case 15:de(2,e,t),jl||de(4,e,t),Jt(l,t,e);break;case 1:jl||(Ut(e,t),a=e.stateNode,typeof a.componentWillUnmount=="function"&&w0(e,t,a)),Jt(l,t,e);break;case 21:Jt(l,t,e);break;case 22:jl=(a=jl)||e.memoizedState!==null,Jt(l,t,e),jl=a;break;default:Jt(l,t,e)}}function 
td(l,t){if(t.memoizedState===null&&(l=t.alternate,l!==null&&(l=l.memoizedState,l!==null))){l=l.dehydrated;try{_a(l)}catch(e){hl(t,t.return,e)}}}function ed(l,t){if(t.memoizedState===null&&(l=t.alternate,l!==null&&(l=l.memoizedState,l!==null&&(l=l.dehydrated,l!==null))))try{_a(l)}catch(e){hl(t,t.return,e)}}function fy(l){switch(l.tag){case 31:case 13:case 19:var t=l.stateNode;return t===null&&(t=l.stateNode=new F0),t;case 22:return l=l.stateNode,t=l._retryCache,t===null&&(t=l._retryCache=new F0),t;default:throw Error(o(435,l.tag))}}function rn(l,t){var e=fy(l);t.forEach(function(a){if(!e.has(a)){e.add(a);var u=gy.bind(null,l,a);a.then(u,u)}})}function lt(l,t){var e=t.deletions;if(e!==null)for(var a=0;a<e.length;a++){var u=e[a],n=l,c=t,i=c;l:for(;i!==null;){switch(i.tag){case 27:if(ge(i.type)){_l=i.stateNode,Pl=!1;break l}break;case 5:_l=i.stateNode,Pl=!1;break l;case 3:case 4:_l=i.stateNode.containerInfo,Pl=!0;break l}i=i.return}if(_l===null)throw Error(o(160));ld(n,c,u),_l=null,Pl=!1,n=u.alternate,n!==null&&(n.return=null),u.return=null}if(t.subtreeFlags&13886)for(t=t.child;t!==null;)ad(t,l),t=t.sibling}var Mt=null;function ad(l,t){var e=l.alternate,a=l.flags;switch(l.tag){case 0:case 11:case 14:case 15:lt(t,l),tt(l),a&4&&(de(3,l,l.return),eu(3,l),de(5,l,l.return));break;case 1:lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),a&64&&Kt&&(l=l.updateQueue,l!==null&&(a=l.callbacks,a!==null&&(e=l.shared.hiddenCallbacks,l.shared.hiddenCallbacks=e===null?a:e.concat(a))));break;case 26:var u=Mt;if(lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),a&4){var n=e!==null?e.memoizedState:null;if(a=l.memoizedState,e===null)if(a===null)if(l.stateNode===null){l:{a=l.type,e=l.memoizedProps,u=u.ownerDocument||u;t:switch(a){case"title":n=u.getElementsByTagName("title")[0],(!n||n[Na]||n[Ql]||n.namespaceURI==="http://www.w3.org/2000/svg"||n.hasAttribute("itemprop"))&&(n=u.createElement(a),u.head.insertBefore(n,u.querySelector("head > title"))),Ll(n,a,e),n[Ql]=l,Yl(n),a=n;break 
l;case"link":var c=ao("link","href",u).get(a+(e.href||""));if(c){for(var i=0;i<c.length;i++)if(n=c[i],n.getAttribute("href")===(e.href==null||e.href===""?null:e.href)&&n.getAttribute("rel")===(e.rel==null?null:e.rel)&&n.getAttribute("title")===(e.title==null?null:e.title)&&n.getAttribute("crossorigin")===(e.crossOrigin==null?null:e.crossOrigin)){c.splice(i,1);break t}}n=u.createElement(a),Ll(n,a,e),u.head.appendChild(n);break;case"meta":if(c=ao("meta","content",u).get(a+(e.content||""))){for(i=0;i<c.length;i++)if(n=c[i],n.getAttribute("content")===(e.content==null?null:""+e.content)&&n.getAttribute("name")===(e.name==null?null:e.name)&&n.getAttribute("property")===(e.property==null?null:e.property)&&n.getAttribute("http-equiv")===(e.httpEquiv==null?null:e.httpEquiv)&&n.getAttribute("charset")===(e.charSet==null?null:e.charSet)){c.splice(i,1);break t}}n=u.createElement(a),Ll(n,a,e),u.head.appendChild(n);break;default:throw Error(o(468,a))}n[Ql]=l,Yl(n),a=n}l.stateNode=a}else uo(u,l.type,l.stateNode);else l.stateNode=eo(u,a,l.memoizedProps);else n!==a?(n===null?e.stateNode!==null&&(e=e.stateNode,e.parentNode.removeChild(e)):n.count--,a===null?uo(u,l.type,l.stateNode):eo(u,a,l.memoizedProps)):a===null&&l.stateNode!==null&&zi(l,l.memoizedProps,e.memoizedProps)}break;case 27:lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),e!==null&&a&4&&zi(l,l.memoizedProps,e.memoizedProps);break;case 5:if(lt(t,l),tt(l),a&512&&(jl||e===null||Ut(e,e.return)),l.flags&32){u=l.stateNode;try{We(u,"")}catch(j){hl(l,l.return,j)}}a&4&&l.stateNode!=null&&(u=l.memoizedProps,zi(l,u,e!==null?e.memoizedProps:u)),a&1024&&(Ai=!0);break;case 6:if(lt(t,l),tt(l),a&4){if(l.stateNode===null)throw Error(o(162));a=l.memoizedProps,e=l.stateNode;try{e.nodeValue=a}catch(j){hl(l,l.return,j)}}break;case 3:if(Nn=null,u=Mt,Mt=xn(t.containerInfo),lt(t,l),Mt=u,tt(l),a&4&&e!==null&&e.memoizedState.isDehydrated)try{_a(t.containerInfo)}catch(j){hl(l,l.return,j)}Ai&&(Ai=!1,ud(l));break;case 
4:a=Mt,Mt=xn(l.stateNode.containerInfo),lt(t,l),tt(l),Mt=a;break;case 12:lt(t,l),tt(l);break;case 31:lt(t,l),tt(l),a&4&&(a=l.updateQueue,a!==null&&(l.updateQueue=null,rn(l,a)));break;case 13:lt(t,l),tt(l),l.child.flags&8192&&l.memoizedState!==null!=(e!==null&&e.memoizedState!==null)&&(vn=nt()),a&4&&(a=l.updateQueue,a!==null&&(l.updateQueue=null,rn(l,a)));break;case 22:u=l.memoizedState!==null;var f=e!==null&&e.memoizedState!==null,r=Kt,b=jl;if(Kt=r||u,jl=b||f,lt(t,l),jl=b,Kt=r,tt(l),a&8192)l:for(t=l.stateNode,t._visibility=u?t._visibility&-2:t._visibility|1,u&&(e===null||f||Kt||jl||Qe(l)),e=null,t=l;;){if(t.tag===5||t.tag===26){if(e===null){f=e=t;try{if(n=f.stateNode,u)c=n.style,typeof c.setProperty=="function"?c.setProperty("display","none","important"):c.display="none";else{i=f.stateNode;var A=f.memoizedProps.style,v=A!=null&&A.hasOwnProperty("display")?A.display:null;i.style.display=v==null||typeof v=="boolean"?"":(""+v).trim()}}catch(j){hl(f,f.return,j)}}}else if(t.tag===6){if(e===null){f=t;try{f.stateNode.nodeValue=u?"":f.memoizedProps}catch(j){hl(f,f.return,j)}}}else if(t.tag===18){if(e===null){f=t;try{var S=f.stateNode;u?wd(S,!0):wd(f.stateNode,!1)}catch(j){hl(f,f.return,j)}}}else if((t.tag!==22&&t.tag!==23||t.memoizedState===null||t===l)&&t.child!==null){t.child.return=t,t=t.child;continue}if(t===l)break l;for(;t.sibling===null;){if(t.return===null||t.return===l)break l;e===t&&(e=null),t=t.return}e===t&&(e=null),t.sibling.return=t.return,t=t.sibling}a&4&&(a=l.updateQueue,a!==null&&(e=a.retryQueue,e!==null&&(a.retryQueue=null,rn(l,e))));break;case 19:lt(t,l),tt(l),a&4&&(a=l.updateQueue,a!==null&&(l.updateQueue=null,rn(l,a)));break;case 30:break;case 21:break;default:lt(t,l),tt(l)}}function tt(l){var t=l.flags;if(t&2){try{for(var e,a=l.return;a!==null;){if($0(a)){e=a;break}a=a.return}if(e==null)throw Error(o(160));switch(e.tag){case 27:var u=e.stateNode,n=Ei(l);yn(l,n,u);break;case 5:var c=e.stateNode;e.flags&32&&(We(c,""),e.flags&=-33);var 
i=Ei(l);yn(l,i,c);break;case 3:case 4:var f=e.stateNode.containerInfo,r=Ei(l);Ti(l,r,f);break;default:throw Error(o(161))}}catch(b){hl(l,l.return,b)}l.flags&=-3}t&4096&&(l.flags&=-4097)}function ud(l){if(l.subtreeFlags&1024)for(l=l.child;l!==null;){var t=l;ud(t),t.tag===5&&t.flags&1024&&t.stateNode.reset(),l=l.sibling}}function wt(l,t){if(t.subtreeFlags&8772)for(t=t.child;t!==null;)I0(l,t.alternate,t),t=t.sibling}function Qe(l){for(l=l.child;l!==null;){var t=l;switch(t.tag){case 0:case 11:case 14:case 15:de(4,t,t.return),Qe(t);break;case 1:Ut(t,t.return);var e=t.stateNode;typeof e.componentWillUnmount=="function"&&w0(t,t.return,e),Qe(t);break;case 27:mu(t.stateNode);case 26:case 5:Ut(t,t.return),Qe(t);break;case 22:t.memoizedState===null&&Qe(t);break;case 30:Qe(t);break;default:Qe(t)}l=l.sibling}}function Wt(l,t,e){for(e=e&&(t.subtreeFlags&8772)!==0,t=t.child;t!==null;){var a=t.alternate,u=l,n=t,c=n.flags;switch(n.tag){case 0:case 11:case 15:Wt(u,n,e),eu(4,n);break;case 1:if(Wt(u,n,e),a=n,u=a.stateNode,typeof u.componentDidMount=="function")try{u.componentDidMount()}catch(r){hl(a,a.return,r)}if(a=n,u=a.updateQueue,u!==null){var i=a.stateNode;try{var f=u.shared.hiddenCallbacks;if(f!==null)for(u.shared.hiddenCallbacks=null,u=0;u<f.length;u++)Hs(f[u],i)}catch(r){hl(a,a.return,r)}}e&&c&64&&J0(n),au(n,n.return);break;case 27:k0(n);case 26:case 5:Wt(u,n,e),e&&a===null&&c&4&&W0(n),au(n,n.return);break;case 12:Wt(u,n,e);break;case 31:Wt(u,n,e),e&&c&4&&td(u,n);break;case 13:Wt(u,n,e),e&&c&4&&ed(u,n);break;case 22:n.memoizedState===null&&Wt(u,n,e),au(n,n.return);break;case 30:break;default:Wt(u,n,e)}t=t.sibling}}function pi(l,t){var e=null;l!==null&&l.memoizedState!==null&&l.memoizedState.cachePool!==null&&(e=l.memoizedState.cachePool.pool),l=null,t.memoizedState!==null&&t.memoizedState.cachePool!==null&&(l=t.memoizedState.cachePool.pool),l!==e&&(l!=null&&l.refCount++,e!=null&&Va(e))}function 
_i(l,t){l=null,t.alternate!==null&&(l=t.alternate.memoizedState.cache),t=t.memoizedState.cache,t!==l&&(t.refCount++,l!=null&&Va(l))}function xt(l,t,e,a){if(t.subtreeFlags&10256)for(t=t.child;t!==null;)nd(l,t,e,a),t=t.sibling}function nd(l,t,e,a){var u=t.flags;switch(t.tag){case 0:case 11:case 15:xt(l,t,e,a),u&2048&&eu(9,t);break;case 1:xt(l,t,e,a);break;case 3:xt(l,t,e,a),u&2048&&(l=null,t.alternate!==null&&(l=t.alternate.memoizedState.cache),t=t.memoizedState.cache,t!==l&&(t.refCount++,l!=null&&Va(l)));break;case 12:if(u&2048){xt(l,t,e,a),l=t.stateNode;try{var n=t.memoizedProps,c=n.id,i=n.onPostCommit;typeof i=="function"&&i(c,t.alternate===null?"mount":"update",l.passiveEffectDuration,-0)}catch(f){hl(t,t.return,f)}}else xt(l,t,e,a);break;case 31:xt(l,t,e,a);break;case 13:xt(l,t,e,a);break;case 23:break;case 22:n=t.stateNode,c=t.alternate,t.memoizedState!==null?n._visibility&2?xt(l,t,e,a):uu(l,t):n._visibility&2?xt(l,t,e,a):(n._visibility|=2,ra(l,t,e,a,(t.subtreeFlags&10256)!==0||!1)),u&2048&&pi(c,t);break;case 24:xt(l,t,e,a),u&2048&&_i(t.alternate,t);break;default:xt(l,t,e,a)}}function ra(l,t,e,a,u){for(u=u&&((t.subtreeFlags&10256)!==0||!1),t=t.child;t!==null;){var n=l,c=t,i=e,f=a,r=c.flags;switch(c.tag){case 0:case 11:case 15:ra(n,c,i,f,u),eu(8,c);break;case 23:break;case 22:var b=c.stateNode;c.memoizedState!==null?b._visibility&2?ra(n,c,i,f,u):uu(n,c):(b._visibility|=2,ra(n,c,i,f,u)),u&&r&2048&&pi(c.alternate,c);break;case 24:ra(n,c,i,f,u),u&&r&2048&&_i(c.alternate,c);break;default:ra(n,c,i,f,u)}t=t.sibling}}function uu(l,t){if(t.subtreeFlags&10256)for(t=t.child;t!==null;){var e=l,a=t,u=a.flags;switch(a.tag){case 22:uu(e,a),u&2048&&pi(a.alternate,a);break;case 24:uu(e,a),u&2048&&_i(a.alternate,a);break;default:uu(e,a)}t=t.sibling}}var nu=8192;function ha(l,t,e){if(l.subtreeFlags&nu)for(l=l.child;l!==null;)cd(l,t,e),l=l.sibling}function cd(l,t,e){switch(l.tag){case 
26:ha(l,t,e),l.flags&nu&&l.memoizedState!==null&&Wy(e,Mt,l.memoizedState,l.memoizedProps);break;case 5:ha(l,t,e);break;case 3:case 4:var a=Mt;Mt=xn(l.stateNode.containerInfo),ha(l,t,e),Mt=a;break;case 22:l.memoizedState===null&&(a=l.alternate,a!==null&&a.memoizedState!==null?(a=nu,nu=16777216,ha(l,t,e),nu=a):ha(l,t,e));break;default:ha(l,t,e)}}function id(l){var t=l.alternate;if(t!==null&&(l=t.child,l!==null)){t.child=null;do t=l.sibling,l.sibling=null,l=t;while(l!==null)}}function cu(l){var t=l.deletions;if((l.flags&16)!==0){if(t!==null)for(var e=0;e<t.length;e++){var a=t[e];Gl=a,sd(a,l)}id(l)}if(l.subtreeFlags&10256)for(l=l.child;l!==null;)fd(l),l=l.sibling}function fd(l){switch(l.tag){case 0:case 11:case 15:cu(l),l.flags&2048&&de(9,l,l.return);break;case 3:cu(l);break;case 12:cu(l);break;case 22:var t=l.stateNode;l.memoizedState!==null&&t._visibility&2&&(l.return===null||l.return.tag!==13)?(t._visibility&=-3,hn(l)):cu(l);break;default:cu(l)}}function hn(l){var t=l.deletions;if((l.flags&16)!==0){if(t!==null)for(var e=0;e<t.length;e++){var a=t[e];Gl=a,sd(a,l)}id(l)}for(l=l.child;l!==null;){switch(t=l,t.tag){case 0:case 11:case 15:de(8,t,t.return),hn(t);break;case 22:e=t.stateNode,e._visibility&2&&(e._visibility&=-3,hn(t));break;default:hn(t)}l=l.sibling}}function sd(l,t){for(;Gl!==null;){var e=Gl;switch(e.tag){case 0:case 11:case 15:de(8,e,t);break;case 23:case 22:if(e.memoizedState!==null&&e.memoizedState.cachePool!==null){var a=e.memoizedState.cachePool.pool;a!=null&&a.refCount++}break;case 24:Va(e.memoizedState.cache)}if(a=e.child,a!==null)a.return=e,Gl=a;else l:for(e=l;Gl!==null;){a=Gl;var u=a.sibling,n=a.return;if(P0(a),a===e){Gl=null;break l}if(u!==null){u.return=n,Gl=u;break l}Gl=n}}}var sy={getCacheForType:function(l){var t=Zl(Ul),e=t.data.get(l);return e===void 0&&(e=l(),t.data.set(l,e)),e},cacheSignal:function(){return Zl(Ul).controller.signal}},dy=typeof 
WeakMap=="function"?WeakMap:Map,sl=0,El=null,F=null,ll=0,rl=0,mt=null,oe=!1,va=!1,Oi=!1,$t=0,Ml=0,me=0,Xe=0,Mi=0,yt=0,ga=0,iu=null,et=null,xi=!1,vn=0,dd=0,gn=1/0,Sn=null,ye=null,ql=0,re=null,Sa=null,kt=0,Di=0,Ni=null,od=null,fu=0,Ui=null;function rt(){return(sl&2)!==0&&ll!==0?ll&-ll:z.T!==null?qi():Mf()}function md(){if(yt===0)if((ll&536870912)===0||el){var l=_u;_u<<=1,(_u&3932160)===0&&(_u=262144),yt=l}else yt=536870912;return l=dt.current,l!==null&&(l.flags|=32),yt}function at(l,t,e){(l===El&&(rl===2||rl===9)||l.cancelPendingCommit!==null)&&(ba(l,0),he(l,ll,yt,!1)),Da(l,e),((sl&2)===0||l!==El)&&(l===El&&((sl&2)===0&&(Xe|=e),Ml===4&&he(l,ll,yt,!1)),Ct(l))}function yd(l,t,e){if((sl&6)!==0)throw Error(o(327));var a=!e&&(t&127)===0&&(t&l.expiredLanes)===0||xa(l,t),u=a?yy(l,t):Ri(l,t,!0),n=a;do{if(u===0){va&&!a&&he(l,t,0,!1);break}else{if(e=l.current.alternate,n&&!oy(e)){u=Ri(l,t,!1),n=!1;continue}if(u===2){if(n=t,l.errorRecoveryDisabledLanes&n)var c=0;else c=l.pendingLanes&-536870913,c=c!==0?c:c&536870912?536870912:0;if(c!==0){t=c;l:{var i=l;u=iu;var f=i.current.memoizedState.isDehydrated;if(f&&(ba(i,c).flags|=256),c=Ri(i,c,!1),c!==2){if(Oi&&!f){i.errorRecoveryDisabledLanes|=n,Xe|=n,u=4;break l}n=et,et=u,n!==null&&(et===null?et=n:et.push.apply(et,n))}u=c}if(n=!1,u!==2)continue}}if(u===1){ba(l,0),he(l,t,0,!0);break}l:{switch(a=l,n=u,n){case 0:case 1:throw Error(o(345));case 4:if((t&4194048)!==t)break;case 6:he(a,t,yt,!oe);break l;case 2:et=null;break;case 3:case 5:break;default:throw Error(o(329))}if((t&62914560)===t&&(u=vn+300-nt(),10<u)){if(he(a,t,yt,!oe),Mu(a,0,!0)!==0)break l;kt=t,a.timeoutHandle=Ld(rd.bind(null,a,e,et,Sn,xi,t,yt,Xe,ga,oe,n,"Throttled",-0,0),u);break l}rd(a,e,et,Sn,xi,t,yt,Xe,ga,oe,n,null,-0,0)}}break}while(!0);Ct(l)}function 
rd(l,t,e,a,u,n,c,i,f,r,b,A,v,S){if(l.timeoutHandle=-1,A=t.subtreeFlags,A&8192||(A&16785408)===16785408){A={stylesheets:null,count:0,imgCount:0,imgBytes:0,suspenseyImages:[],waitingForImages:!0,waitingForViewTransition:!1,unsuspend:Ht},cd(t,n,A);var j=(n&62914560)===n?vn-nt():(n&4194048)===n?dd-nt():0;if(j=$y(A,j),j!==null){kt=n,l.cancelPendingCommit=j(Td.bind(null,l,t,n,e,a,u,c,i,f,b,A,null,v,S)),he(l,n,c,!r);return}}Td(l,t,n,e,a,u,c,i,f)}function oy(l){for(var t=l;;){var e=t.tag;if((e===0||e===11||e===15)&&t.flags&16384&&(e=t.updateQueue,e!==null&&(e=e.stores,e!==null)))for(var a=0;a<e.length;a++){var u=e[a],n=u.getSnapshot;u=u.value;try{if(!ft(n(),u))return!1}catch{return!1}}if(e=t.child,t.subtreeFlags&16384&&e!==null)e.return=t,t=e;else{if(t===l)break;for(;t.sibling===null;){if(t.return===null||t.return===l)return!0;t=t.return}t.sibling.return=t.return,t=t.sibling}}return!0}function he(l,t,e,a){t&=~Mi,t&=~Xe,l.suspendedLanes|=t,l.pingedLanes&=~t,a&&(l.warmLanes|=t),a=l.expirationTimes;for(var u=t;0<u;){var n=31-it(u),c=1<<n;a[n]=-1,u&=~c}e!==0&&pf(l,e,t)}function bn(){return(sl&6)===0?(su(0),!1):!0}function Ci(){if(F!==null){if(rl===0)var l=F.return;else l=F,Gt=Ce=null,Wc(l),sa=null,Ka=0,l=F;for(;l!==null;)K0(l.alternate,l),l=l.return;F=null}}function ba(l,t){var e=l.timeoutHandle;e!==-1&&(l.timeoutHandle=-1,Uy(e)),e=l.cancelPendingCommit,e!==null&&(l.cancelPendingCommit=null,e()),kt=0,Ci(),El=l,F=e=qt(l.current,null),ll=t,rl=0,mt=null,oe=!1,va=xa(l,t),Oi=!1,ga=yt=Mi=Xe=me=Ml=0,et=iu=null,xi=!1,(t&8)!==0&&(t|=t&32);var a=l.entangledLanes;if(a!==0)for(l=l.entanglements,a&=t;0<a;){var u=31-it(a),n=1<<u;t|=l[u],a&=~n}return $t=t,Gu(),e}function hd(l,t){J=null,z.H=Pa,t===fa||t===wu?(t=Us(),rl=3):t===Bc?(t=Us(),rl=4):rl=t===di?8:t!==null&&typeof t=="object"&&typeof t.then=="function"?6:1,mt=t,F===null&&(Ml=1,fn(l,St(t,l.current)))}function vd(){var l=dt.current;return l===null?!0:(ll&4194048)===ll?Tt===null:(ll&62914560)===ll||(ll&536870912)!==0?l===Tt:!1}function 
gd(){var l=z.H;return z.H=Pa,l===null?Pa:l}function Sd(){var l=z.A;return z.A=sy,l}function zn(){Ml=4,oe||(ll&4194048)!==ll&&dt.current!==null||(va=!0),(me&134217727)===0&&(Xe&134217727)===0||El===null||he(El,ll,yt,!1)}function Ri(l,t,e){var a=sl;sl|=2;var u=gd(),n=Sd();(El!==l||ll!==t)&&(Sn=null,ba(l,t)),t=!1;var c=Ml;l:do try{if(rl!==0&&F!==null){var i=F,f=mt;switch(rl){case 8:Ci(),c=6;break l;case 3:case 2:case 9:case 6:dt.current===null&&(t=!0);var r=rl;if(rl=0,mt=null,za(l,i,f,r),e&&va){c=0;break l}break;default:r=rl,rl=0,mt=null,za(l,i,f,r)}}my(),c=Ml;break}catch(b){hd(l,b)}while(!0);return t&&l.shellSuspendCounter++,Gt=Ce=null,sl=a,z.H=u,z.A=n,F===null&&(El=null,ll=0,Gu()),c}function my(){for(;F!==null;)bd(F)}function yy(l,t){var e=sl;sl|=2;var a=gd(),u=Sd();El!==l||ll!==t?(Sn=null,gn=nt()+500,ba(l,t)):va=xa(l,t);l:do try{if(rl!==0&&F!==null){t=F;var n=mt;t:switch(rl){case 1:rl=0,mt=null,za(l,t,n,1);break;case 2:case 9:if(Ds(n)){rl=0,mt=null,zd(t);break}t=function(){rl!==2&&rl!==9||El!==l||(rl=7),Ct(l)},n.then(t,t);break l;case 3:rl=7;break l;case 4:rl=5;break l;case 7:Ds(n)?(rl=0,mt=null,zd(t)):(rl=0,mt=null,za(l,t,n,7));break;case 5:var c=null;switch(F.tag){case 26:c=F.memoizedState;case 5:case 27:var i=F;if(c?no(c):i.stateNode.complete){rl=0,mt=null;var f=i.sibling;if(f!==null)F=f;else{var r=i.return;r!==null?(F=r,En(r)):F=null}break t}}rl=0,mt=null,za(l,t,n,5);break;case 6:rl=0,mt=null,za(l,t,n,6);break;case 8:Ci(),Ml=6;break l;default:throw Error(o(462))}}ry();break}catch(b){hd(l,b)}while(!0);return Gt=Ce=null,z.H=a,z.A=u,sl=e,F!==null?0:(El=null,ll=0,Gu(),Ml)}function ry(){for(;F!==null&&!qo();)bd(F)}function bd(l){var t=V0(l.alternate,l,$t);l.memoizedProps=l.pendingProps,t===null?En(l):F=t}function zd(l){var t=l,e=t.alternate;switch(t.tag){case 15:case 0:t=q0(e,t,t.pendingProps,t.type,void 0,ll);break;case 11:t=q0(e,t,t.pendingProps,t.type.render,t.ref,ll);break;case 
5:Wc(t);default:K0(e,t),t=F=Ss(t,$t),t=V0(e,t,$t)}l.memoizedProps=l.pendingProps,t===null?En(l):F=t}function za(l,t,e,a){Gt=Ce=null,Wc(t),sa=null,Ka=0;var u=t.return;try{if(ey(l,u,t,e,ll)){Ml=1,fn(l,St(e,l.current)),F=null;return}}catch(n){if(u!==null)throw F=u,n;Ml=1,fn(l,St(e,l.current)),F=null;return}t.flags&32768?(el||a===1?l=!0:va||(ll&536870912)!==0?l=!1:(oe=l=!0,(a===2||a===9||a===3||a===6)&&(a=dt.current,a!==null&&a.tag===13&&(a.flags|=16384))),Ed(t,l)):En(t)}function En(l){var t=l;do{if((t.flags&32768)!==0){Ed(t,oe);return}l=t.return;var e=ny(t.alternate,t,$t);if(e!==null){F=e;return}if(t=t.sibling,t!==null){F=t;return}F=t=l}while(t!==null);Ml===0&&(Ml=5)}function Ed(l,t){do{var e=cy(l.alternate,l);if(e!==null){e.flags&=32767,F=e;return}if(e=l.return,e!==null&&(e.flags|=32768,e.subtreeFlags=0,e.deletions=null),!t&&(l=l.sibling,l!==null)){F=l;return}F=l=e}while(l!==null);Ml=6,F=null}function Td(l,t,e,a,u,n,c,i,f){l.cancelPendingCommit=null;do Tn();while(ql!==0);if((sl&6)!==0)throw Error(o(327));if(t!==null){if(t===l.current)throw Error(o(177));if(n=t.lanes|t.childLanes,n|=Ec,wo(l,e,n,c,i,f),l===El&&(F=El=null,ll=0),Sa=t,re=l,kt=e,Di=n,Ni=u,od=a,(t.subtreeFlags&10256)!==0||(t.flags&10256)!==0?(l.callbackNode=null,l.callbackPriority=0,Sy(Au,function(){return Md(),null})):(l.callbackNode=null,l.callbackPriority=0),a=(t.flags&13878)!==0,(t.subtreeFlags&13878)!==0||a){a=z.T,z.T=null,u=D.p,D.p=2,c=sl,sl|=4;try{iy(l,t,e)}finally{sl=c,D.p=u,z.T=a}}ql=1,Ad(),pd(),_d()}}function Ad(){if(ql===1){ql=0;var l=re,t=Sa,e=(t.flags&13878)!==0;if((t.subtreeFlags&13878)!==0||e){e=z.T,z.T=null;var a=D.p;D.p=2;var u=sl;sl|=4;try{ad(t,l);var n=Ki,c=ss(l.containerInfo),i=n.focusedElem,f=n.selectionRange;if(c!==i&&i&&i.ownerDocument&&fs(i.ownerDocument.documentElement,i)){if(f!==null&&vc(i)){var r=f.start,b=f.end;if(b===void 0&&(b=r),"selectionStart"in i)i.selectionStart=r,i.selectionEnd=Math.min(b,i.value.length);else{var 
A=i.ownerDocument||document,v=A&&A.defaultView||window;if(v.getSelection){var S=v.getSelection(),j=i.textContent.length,Z=Math.min(f.start,j),bl=f.end===void 0?Z:Math.min(f.end,j);!S.extend&&Z>bl&&(c=bl,bl=Z,Z=c);var m=is(i,Z),s=is(i,bl);if(m&&s&&(S.rangeCount!==1||S.anchorNode!==m.node||S.anchorOffset!==m.offset||S.focusNode!==s.node||S.focusOffset!==s.offset)){var y=A.createRange();y.setStart(m.node,m.offset),S.removeAllRanges(),Z>bl?(S.addRange(y),S.extend(s.node,s.offset)):(y.setEnd(s.node,s.offset),S.addRange(y))}}}}for(A=[],S=i;S=S.parentNode;)S.nodeType===1&&A.push({element:S,left:S.scrollLeft,top:S.scrollTop});for(typeof i.focus=="function"&&i.focus(),i=0;i<A.length;i++){var T=A[i];T.element.scrollLeft=T.left,T.element.scrollTop=T.top}}jn=!!Li,Ki=Li=null}finally{sl=u,D.p=a,z.T=e}}l.current=t,ql=2}}function pd(){if(ql===2){ql=0;var l=re,t=Sa,e=(t.flags&8772)!==0;if((t.subtreeFlags&8772)!==0||e){e=z.T,z.T=null;var a=D.p;D.p=2;var u=sl;sl|=4;try{I0(l,t.alternate,t)}finally{sl=u,D.p=a,z.T=e}}ql=3}}function _d(){if(ql===4||ql===3){ql=0,Yo();var l=re,t=Sa,e=kt,a=od;(t.subtreeFlags&10256)!==0||(t.flags&10256)!==0?ql=5:(ql=0,Sa=re=null,Od(l,l.pendingLanes));var u=l.pendingLanes;if(u===0&&(ye=null),Fn(e),t=t.stateNode,ct&&typeof ct.onCommitFiberRoot=="function")try{ct.onCommitFiberRoot(Ma,t,void 0,(t.current.flags&128)===128)}catch{}if(a!==null){t=z.T,u=D.p,D.p=2,z.T=null;try{for(var n=l.onRecoverableError,c=0;c<a.length;c++){var i=a[c];n(i.value,{componentStack:i.stack})}}finally{z.T=t,D.p=u}}(kt&3)!==0&&Tn(),Ct(l),u=l.pendingLanes,(e&261930)!==0&&(u&42)!==0?l===Ui?fu++:(fu=0,Ui=l):fu=0,su(0)}}function Od(l,t){(l.pooledCacheLanes&=t)===0&&(t=l.pooledCache,t!=null&&(l.pooledCache=null,Va(t)))}function Tn(){return Ad(),pd(),_d(),Md()}function Md(){if(ql!==5)return!1;var l=re,t=Di;Di=0;var e=Fn(kt),a=z.T,u=D.p;try{D.p=32>e?32:e,z.T=null,e=Ni,Ni=null;var n=re,c=kt;if(ql=0,Sa=re=null,kt=0,(sl&6)!==0)throw Error(o(331));var 
i=sl;if(sl|=4,fd(n.current),nd(n,n.current,c,e),sl=i,su(0,!1),ct&&typeof ct.onPostCommitFiberRoot=="function")try{ct.onPostCommitFiberRoot(Ma,n)}catch{}return!0}finally{D.p=u,z.T=a,Od(l,t)}}function xd(l,t,e){t=St(e,t),t=si(l.stateNode,t,2),l=ie(l,t,2),l!==null&&(Da(l,2),Ct(l))}function hl(l,t,e){if(l.tag===3)xd(l,l,e);else for(;t!==null;){if(t.tag===3){xd(t,l,e);break}else if(t.tag===1){var a=t.stateNode;if(typeof t.type.getDerivedStateFromError=="function"||typeof a.componentDidCatch=="function"&&(ye===null||!ye.has(a))){l=St(e,l),e=D0(2),a=ie(t,e,2),a!==null&&(N0(e,a,t,l),Da(a,2),Ct(a));break}}t=t.return}}function ji(l,t,e){var a=l.pingCache;if(a===null){a=l.pingCache=new dy;var u=new Set;a.set(t,u)}else u=a.get(t),u===void 0&&(u=new Set,a.set(t,u));u.has(e)||(Oi=!0,u.add(e),l=hy.bind(null,l,t,e),t.then(l,l))}function hy(l,t,e){var a=l.pingCache;a!==null&&a.delete(t),l.pingedLanes|=l.suspendedLanes&e,l.warmLanes&=~e,El===l&&(ll&e)===e&&(Ml===4||Ml===3&&(ll&62914560)===ll&&300>nt()-vn?(sl&2)===0&&ba(l,0):Mi|=e,ga===ll&&(ga=0)),Ct(l)}function Dd(l,t){t===0&&(t=Af()),l=De(l,t),l!==null&&(Da(l,t),Ct(l))}function vy(l){var t=l.memoizedState,e=0;t!==null&&(e=t.retryLane),Dd(l,e)}function gy(l,t){var e=0;switch(l.tag){case 31:case 13:var a=l.stateNode,u=l.memoizedState;u!==null&&(e=u.retryLane);break;case 19:a=l.stateNode;break;case 22:a=l.stateNode._retryCache;break;default:throw Error(o(314))}a!==null&&a.delete(t),Dd(l,e)}function Sy(l,t){return wn(l,t)}var An=null,Ea=null,Hi=!1,pn=!1,Bi=!1,ve=0;function Ct(l){l!==Ea&&l.next===null&&(Ea===null?An=Ea=l:Ea=Ea.next=l),pn=!0,Hi||(Hi=!0,zy())}function su(l,t){if(!Bi&&pn){Bi=!0;do for(var e=!1,a=An;a!==null;){if(l!==0){var u=a.pendingLanes;if(u===0)var n=0;else{var c=a.suspendedLanes,i=a.pingedLanes;n=(1<<31-it(42|l)+1)-1,n&=u&~(c&~i),n=n&201326741?n&201326741|1:n?n|2:0}n!==0&&(e=!0,Rd(a,n))}else 
n=ll,n=Mu(a,a===El?n:0,a.cancelPendingCommit!==null||a.timeoutHandle!==-1),(n&3)===0||xa(a,n)||(e=!0,Rd(a,n));a=a.next}while(e);Bi=!1}}function by(){Nd()}function Nd(){pn=Hi=!1;var l=0;ve!==0&&Ny()&&(l=ve);for(var t=nt(),e=null,a=An;a!==null;){var u=a.next,n=Ud(a,t);n===0?(a.next=null,e===null?An=u:e.next=u,u===null&&(Ea=e)):(e=a,(l!==0||(n&3)!==0)&&(pn=!0)),a=u}ql!==0&&ql!==5||su(l),ve!==0&&(ve=0)}function Ud(l,t){for(var e=l.suspendedLanes,a=l.pingedLanes,u=l.expirationTimes,n=l.pendingLanes&-62914561;0<n;){var c=31-it(n),i=1<<c,f=u[c];f===-1?((i&e)===0||(i&a)!==0)&&(u[c]=Jo(i,t)):f<=t&&(l.expiredLanes|=i),n&=~i}if(t=El,e=ll,e=Mu(l,l===t?e:0,l.cancelPendingCommit!==null||l.timeoutHandle!==-1),a=l.callbackNode,e===0||l===t&&(rl===2||rl===9)||l.cancelPendingCommit!==null)return a!==null&&a!==null&&Wn(a),l.callbackNode=null,l.callbackPriority=0;if((e&3)===0||xa(l,e)){if(t=e&-e,t===l.callbackPriority)return t;switch(a!==null&&Wn(a),Fn(e)){case 2:case 8:e=Ef;break;case 32:e=Au;break;case 268435456:e=Tf;break;default:e=Au}return a=Cd.bind(null,l),e=wn(e,a),l.callbackPriority=t,l.callbackNode=e,t}return a!==null&&a!==null&&Wn(a),l.callbackPriority=2,l.callbackNode=null,2}function Cd(l,t){if(ql!==0&&ql!==5)return l.callbackNode=null,l.callbackPriority=0,null;var e=l.callbackNode;if(Tn()&&l.callbackNode!==e)return null;var a=ll;return a=Mu(l,l===El?a:0,l.cancelPendingCommit!==null||l.timeoutHandle!==-1),a===0?null:(yd(l,a,t),Ud(l,nt()),l.callbackNode!=null&&l.callbackNode===e?Cd.bind(null,l):null)}function Rd(l,t){if(Tn())return null;yd(l,t,!0)}function zy(){Cy(function(){(sl&6)!==0?wn(zf,by):Nd()})}function qi(){if(ve===0){var l=ca;l===0&&(l=pu,pu<<=1,(pu&261888)===0&&(pu=256)),ve=l}return ve}function jd(l){return l==null||typeof l=="symbol"||typeof l=="boolean"?null:typeof l=="function"?l:Uu(""+l)}function Hd(l,t){var e=t.ownerDocument.createElement("input");return e.name=t.name,e.value=t.value,l.id&&e.setAttribute("form",l.id),t.parentNode.insertBefore(e,t),l=new 
FormData(l),e.parentNode.removeChild(e),l}function Ey(l,t,e,a,u){if(t==="submit"&&e&&e.stateNode===u){var n=jd((u[Fl]||null).action),c=a.submitter;c&&(t=(t=c[Fl]||null)?jd(t.formAction):c.getAttribute("formAction"),t!==null&&(n=t,c=null));var i=new Hu("action","action",null,a,u);l.push({event:i,listeners:[{instance:null,listener:function(){if(a.defaultPrevented){if(ve!==0){var f=c?Hd(u,c):new FormData(u);ai(e,{pending:!0,data:f,method:u.method,action:n},null,f)}}else typeof n=="function"&&(i.preventDefault(),f=c?Hd(u,c):new FormData(u),ai(e,{pending:!0,data:f,method:u.method,action:n},n,f))},currentTarget:u}]})}}for(var Yi=0;Yi<zc.length;Yi++){var Gi=zc[Yi],Ty=Gi.toLowerCase(),Ay=Gi[0].toUpperCase()+Gi.slice(1);Ot(Ty,"on"+Ay)}Ot(ms,"onAnimationEnd"),Ot(ys,"onAnimationIteration"),Ot(rs,"onAnimationStart"),Ot("dblclick","onDoubleClick"),Ot("focusin","onFocus"),Ot("focusout","onBlur"),Ot(Gm,"onTransitionRun"),Ot(Qm,"onTransitionStart"),Ot(Xm,"onTransitionCancel"),Ot(hs,"onTransitionEnd"),Je("onMouseEnter",["mouseout","mouseover"]),Je("onMouseLeave",["mouseout","mouseover"]),Je("onPointerEnter",["pointerout","pointerover"]),Je("onPointerLeave",["pointerout","pointerover"]),_e("onChange","change click focusin focusout input keydown keyup selectionchange".split(" ")),_e("onSelect","focusout contextmenu dragend focusin keydown keyup mousedown mouseup selectionchange".split(" ")),_e("onBeforeInput",["compositionend","keypress","textInput","paste"]),_e("onCompositionEnd","compositionend focusout keydown keypress keyup mousedown".split(" ")),_e("onCompositionStart","compositionstart focusout keydown keypress keyup mousedown".split(" ")),_e("onCompositionUpdate","compositionupdate focusout keydown keypress keyup mousedown".split(" "));var du="abort canplay canplaythrough durationchange emptied encrypted ended error loadeddata loadedmetadata loadstart pause play playing progress ratechange resize seeked seeking stalled suspend timeupdate volumechange waiting".split(" "),py=new 
Set("beforetoggle cancel close invalid load scroll scrollend toggle".split(" ").concat(du));function Bd(l,t){t=(t&4)!==0;for(var e=0;e<l.length;e++){var a=l[e],u=a.event;a=a.listeners;l:{var n=void 0;if(t)for(var c=a.length-1;0<=c;c--){var i=a[c],f=i.instance,r=i.currentTarget;if(i=i.listener,f!==n&&u.isPropagationStopped())break l;n=i,u.currentTarget=r;try{n(u)}catch(b){Yu(b)}u.currentTarget=null,n=f}else for(c=0;c<a.length;c++){if(i=a[c],f=i.instance,r=i.currentTarget,i=i.listener,f!==n&&u.isPropagationStopped())break l;n=i,u.currentTarget=r;try{n(u)}catch(b){Yu(b)}u.currentTarget=null,n=f}}}}function I(l,t){var e=t[In];e===void 0&&(e=t[In]=new Set);var a=l+"__bubble";e.has(a)||(qd(t,l,2,!1),e.add(a))}function Qi(l,t,e){var a=0;t&&(a|=4),qd(e,l,a,t)}var _n="_reactListening"+Math.random().toString(36).slice(2);function Xi(l){if(!l[_n]){l[_n]=!0,Nf.forEach(function(e){e!=="selectionchange"&&(py.has(e)||Qi(e,!1,l),Qi(e,!0,l))});var t=l.nodeType===9?l:l.ownerDocument;t===null||t[_n]||(t[_n]=!0,Qi("selectionchange",!1,t))}}function qd(l,t,e,a){switch(yo(t)){case 2:var u=Iy;break;case 8:u=Py;break;default:u=ef}e=u.bind(null,t,e,l),u=void 0,!ic||t!=="touchstart"&&t!=="touchmove"&&t!=="wheel"||(u=!0),a?u!==void 0?l.addEventListener(t,e,{capture:!0,passive:u}):l.addEventListener(t,e,!0):u!==void 0?l.addEventListener(t,e,{passive:u}):l.addEventListener(t,e,!1)}function Zi(l,t,e,a,u){var n=a;if((t&1)===0&&(t&2)===0&&a!==null)l:for(;;){if(a===null)return;var c=a.tag;if(c===3||c===4){var i=a.stateNode.containerInfo;if(i===u)break;if(c===4)for(c=a.return;c!==null;){var f=c.tag;if((f===3||f===4)&&c.stateNode.containerInfo===u)return;c=c.return}for(;i!==null;){if(c=Ve(i),c===null)return;if(f=c.tag,f===5||f===6||f===26||f===27){a=n=c;continue l}i=i.parentNode}}a=a.return}Zf(function(){var r=n,b=nc(e),A=[];l:{var v=vs.get(l);if(v!==void 0){var S=Hu,j=l;switch(l){case"keypress":if(Ru(e)===0)break 
l;case"keydown":case"keyup":S=gm;break;case"focusin":j="focus",S=oc;break;case"focusout":j="blur",S=oc;break;case"beforeblur":case"afterblur":S=oc;break;case"click":if(e.button===2)break l;case"auxclick":case"dblclick":case"mousedown":case"mousemove":case"mouseup":case"mouseout":case"mouseover":case"contextmenu":S=Kf;break;case"drag":case"dragend":case"dragenter":case"dragexit":case"dragleave":case"dragover":case"dragstart":case"drop":S=nm;break;case"touchcancel":case"touchend":case"touchmove":case"touchstart":S=zm;break;case ms:case ys:case rs:S=fm;break;case hs:S=Tm;break;case"scroll":case"scrollend":S=am;break;case"wheel":S=pm;break;case"copy":case"cut":case"paste":S=dm;break;case"gotpointercapture":case"lostpointercapture":case"pointercancel":case"pointerdown":case"pointermove":case"pointerout":case"pointerover":case"pointerup":S=wf;break;case"toggle":case"beforetoggle":S=Om}var Z=(t&4)!==0,bl=!Z&&(l==="scroll"||l==="scrollend"),m=Z?v!==null?v+"Capture":null:v;Z=[];for(var s=r,y;s!==null;){var T=s;if(y=T.stateNode,T=T.tag,T!==5&&T!==26&&T!==27||y===null||m===null||(T=Ca(s,m),T!=null&&Z.push(ou(s,T,y))),bl)break;s=s.return}0<Z.length&&(v=new S(v,j,null,e,b),A.push({event:v,listeners:Z}))}}if((t&7)===0){l:{if(v=l==="mouseover"||l==="pointerover",S=l==="mouseout"||l==="pointerout",v&&e!==uc&&(j=e.relatedTarget||e.fromElement)&&(Ve(j)||j[Ze]))break l;if((S||v)&&(v=b.window===b?b:(v=b.ownerDocument)?v.defaultView||v.parentWindow:window,S?(j=e.relatedTarget||e.toElement,S=r,j=j?Ve(j):null,j!==null&&(bl=Y(j),Z=j.tag,j!==bl||Z!==5&&Z!==27&&Z!==6)&&(j=null)):(S=null,j=r),S!==j)){if(Z=Kf,T="onMouseLeave",m="onMouseEnter",s="mouse",(l==="pointerout"||l==="pointerover")&&(Z=wf,T="onPointerLeave",m="onPointerEnter",s="pointer"),bl=S==null?v:Ua(S),y=j==null?v:Ua(j),v=new Z(T,s+"leave",S,e,b),v.target=bl,v.relatedTarget=y,T=null,Ve(b)===r&&(Z=new Z(m,s+"enter",j,e,b),Z.target=y,Z.relatedTarget=bl,T=Z),bl=T,S&&j)t:{for(Z=_y,m=S,s=j,y=0,T=m;T;T=Z(T))y++;T=0;for(var 
Q=s;Q;Q=Z(Q))T++;for(;0<y-T;)m=Z(m),y--;for(;0<T-y;)s=Z(s),T--;for(;y--;){if(m===s||s!==null&&m===s.alternate){Z=m;break t}m=Z(m),s=Z(s)}Z=null}else Z=null;S!==null&&Yd(A,v,S,Z,!1),j!==null&&bl!==null&&Yd(A,bl,j,Z,!0)}}l:{if(v=r?Ua(r):window,S=v.nodeName&&v.nodeName.toLowerCase(),S==="select"||S==="input"&&v.type==="file")var cl=ts;else if(Pf(v))if(es)cl=Bm;else{cl=jm;var B=Rm}else S=v.nodeName,!S||S.toLowerCase()!=="input"||v.type!=="checkbox"&&v.type!=="radio"?r&&ac(r.elementType)&&(cl=ts):cl=Hm;if(cl&&(cl=cl(l,r))){ls(A,cl,e,b);break l}B&&B(l,v,r),l==="focusout"&&r&&v.type==="number"&&r.memoizedProps.value!=null&&ec(v,"number",v.value)}switch(B=r?Ua(r):window,l){case"focusin":(Pf(B)||B.contentEditable==="true")&&(Ie=B,gc=r,Qa=null);break;case"focusout":Qa=gc=Ie=null;break;case"mousedown":Sc=!0;break;case"contextmenu":case"mouseup":case"dragend":Sc=!1,ds(A,e,b);break;case"selectionchange":if(Ym)break;case"keydown":case"keyup":ds(A,e,b)}var w;if(yc)l:{switch(l){case"compositionstart":var tl="onCompositionStart";break l;case"compositionend":tl="onCompositionEnd";break l;case"compositionupdate":tl="onCompositionUpdate";break l}tl=void 0}else Fe?Ff(l,e)&&(tl="onCompositionEnd"):l==="keydown"&&e.keyCode===229&&(tl="onCompositionStart");tl&&(Wf&&e.locale!=="ko"&&(Fe||tl!=="onCompositionStart"?tl==="onCompositionEnd"&&Fe&&(w=Vf()):(le=b,fc="value"in le?le.value:le.textContent,Fe=!0)),B=On(r,tl),0<B.length&&(tl=new Jf(tl,l,null,e,b),A.push({event:tl,listeners:B}),w?tl.data=w:(w=If(e),w!==null&&(tl.data=w)))),(w=xm?Dm(l,e):Nm(l,e))&&(tl=On(r,"onBeforeInput"),0<tl.length&&(B=new Jf("onBeforeInput","beforeinput",null,e,b),A.push({event:B,listeners:tl}),B.data=w)),Ey(A,l,r,e,b)}Bd(A,t)})}function ou(l,t,e){return{instance:l,listener:t,currentTarget:e}}function On(l,t){for(var e=t+"Capture",a=[];l!==null;){var u=l,n=u.stateNode;if(u=u.tag,u!==5&&u!==26&&u!==27||n===null||(u=Ca(l,e),u!=null&&a.unshift(ou(l,u,n)),u=Ca(l,t),u!=null&&a.push(ou(l,u,n))),l.tag===3)return 
a;l=l.return}return[]}function _y(l){if(l===null)return null;do l=l.return;while(l&&l.tag!==5&&l.tag!==27);return l||null}function Yd(l,t,e,a,u){for(var n=t._reactName,c=[];e!==null&&e!==a;){var i=e,f=i.alternate,r=i.stateNode;if(i=i.tag,f!==null&&f===a)break;i!==5&&i!==26&&i!==27||r===null||(f=r,u?(r=Ca(e,n),r!=null&&c.unshift(ou(e,r,f))):u||(r=Ca(e,n),r!=null&&c.push(ou(e,r,f)))),e=e.return}c.length!==0&&l.push({event:t,listeners:c})}var Oy=/\r\n?/g,My=/\u0000|\uFFFD/g;function Gd(l){return(typeof l=="string"?l:""+l).replace(Oy,`
+`).replace(My,"")}function Qd(l,t){return t=Gd(t),Gd(l)===t}function Sl(l,t,e,a,u,n){switch(e){case"children":typeof a=="string"?t==="body"||t==="textarea"&&a===""||We(l,a):(typeof a=="number"||typeof a=="bigint")&&t!=="body"&&We(l,""+a);break;case"className":Du(l,"class",a);break;case"tabIndex":Du(l,"tabindex",a);break;case"dir":case"role":case"viewBox":case"width":case"height":Du(l,e,a);break;case"style":Qf(l,a,n);break;case"data":if(t!=="object"){Du(l,"data",a);break}case"src":case"href":if(a===""&&(t!=="a"||e!=="href")){l.removeAttribute(e);break}if(a==null||typeof a=="function"||typeof a=="symbol"||typeof a=="boolean"){l.removeAttribute(e);break}a=Uu(""+a),l.setAttribute(e,a);break;case"action":case"formAction":if(typeof a=="function"){l.setAttribute(e,"javascript:throw new Error('A React form was unexpectedly submitted. If you called form.submit() manually, consider using form.requestSubmit() instead. If you\\'re trying to use event.stopPropagation() in a submit event handler, consider also calling event.preventDefault().')");break}else typeof n=="function"&&(e==="formAction"?(t!=="input"&&Sl(l,t,"name",u.name,u,null),Sl(l,t,"formEncType",u.formEncType,u,null),Sl(l,t,"formMethod",u.formMethod,u,null),Sl(l,t,"formTarget",u.formTarget,u,null)):(Sl(l,t,"encType",u.encType,u,null),Sl(l,t,"method",u.method,u,null),Sl(l,t,"target",u.target,u,null)));if(a==null||typeof a=="symbol"||typeof a=="boolean"){l.removeAttribute(e);break}a=Uu(""+a),l.setAttribute(e,a);break;case"onClick":a!=null&&(l.onclick=Ht);break;case"onScroll":a!=null&&I("scroll",l);break;case"onScrollEnd":a!=null&&I("scrollend",l);break;case"dangerouslySetInnerHTML":if(a!=null){if(typeof a!="object"||!("__html"in a))throw Error(o(61));if(e=a.__html,e!=null){if(u.children!=null)throw Error(o(60));l.innerHTML=e}}break;case"multiple":l.multiple=a&&typeof a!="function"&&typeof a!="symbol";break;case"muted":l.muted=a&&typeof a!="function"&&typeof 
a!="symbol";break;case"suppressContentEditableWarning":case"suppressHydrationWarning":case"defaultValue":case"defaultChecked":case"innerHTML":case"ref":break;case"autoFocus":break;case"xlinkHref":if(a==null||typeof a=="function"||typeof a=="boolean"||typeof a=="symbol"){l.removeAttribute("xlink:href");break}e=Uu(""+a),l.setAttributeNS("http://www.w3.org/1999/xlink","xlink:href",e);break;case"contentEditable":case"spellCheck":case"draggable":case"value":case"autoReverse":case"externalResourcesRequired":case"focusable":case"preserveAlpha":a!=null&&typeof a!="function"&&typeof a!="symbol"?l.setAttribute(e,""+a):l.removeAttribute(e);break;case"inert":case"allowFullScreen":case"async":case"autoPlay":case"controls":case"default":case"defer":case"disabled":case"disablePictureInPicture":case"disableRemotePlayback":case"formNoValidate":case"hidden":case"loop":case"noModule":case"noValidate":case"open":case"playsInline":case"readOnly":case"required":case"reversed":case"scoped":case"seamless":case"itemScope":a&&typeof a!="function"&&typeof a!="symbol"?l.setAttribute(e,""):l.removeAttribute(e);break;case"capture":case"download":a===!0?l.setAttribute(e,""):a!==!1&&a!=null&&typeof a!="function"&&typeof a!="symbol"?l.setAttribute(e,a):l.removeAttribute(e);break;case"cols":case"rows":case"size":case"span":a!=null&&typeof a!="function"&&typeof a!="symbol"&&!isNaN(a)&&1<=a?l.setAttribute(e,a):l.removeAttribute(e);break;case"rowSpan":case"start":a==null||typeof a=="function"||typeof 
a=="symbol"||isNaN(a)?l.removeAttribute(e):l.setAttribute(e,a);break;case"popover":I("beforetoggle",l),I("toggle",l),xu(l,"popover",a);break;case"xlinkActuate":jt(l,"http://www.w3.org/1999/xlink","xlink:actuate",a);break;case"xlinkArcrole":jt(l,"http://www.w3.org/1999/xlink","xlink:arcrole",a);break;case"xlinkRole":jt(l,"http://www.w3.org/1999/xlink","xlink:role",a);break;case"xlinkShow":jt(l,"http://www.w3.org/1999/xlink","xlink:show",a);break;case"xlinkTitle":jt(l,"http://www.w3.org/1999/xlink","xlink:title",a);break;case"xlinkType":jt(l,"http://www.w3.org/1999/xlink","xlink:type",a);break;case"xmlBase":jt(l,"http://www.w3.org/XML/1998/namespace","xml:base",a);break;case"xmlLang":jt(l,"http://www.w3.org/XML/1998/namespace","xml:lang",a);break;case"xmlSpace":jt(l,"http://www.w3.org/XML/1998/namespace","xml:space",a);break;case"is":xu(l,"is",a);break;case"innerText":case"textContent":break;default:(!(2<e.length)||e[0]!=="o"&&e[0]!=="O"||e[1]!=="n"&&e[1]!=="N")&&(e=tm.get(e)||e,xu(l,e,a))}}function Vi(l,t,e,a,u,n){switch(e){case"style":Qf(l,a,n);break;case"dangerouslySetInnerHTML":if(a!=null){if(typeof a!="object"||!("__html"in a))throw Error(o(61));if(e=a.__html,e!=null){if(u.children!=null)throw Error(o(60));l.innerHTML=e}}break;case"children":typeof a=="string"?We(l,a):(typeof a=="number"||typeof a=="bigint")&&We(l,""+a);break;case"onScroll":a!=null&&I("scroll",l);break;case"onScrollEnd":a!=null&&I("scrollend",l);break;case"onClick":a!=null&&(l.onclick=Ht);break;case"suppressContentEditableWarning":case"suppressHydrationWarning":case"innerHTML":case"ref":break;case"innerText":case"textContent":break;default:if(!Uf.hasOwnProperty(e))l:{if(e[0]==="o"&&e[1]==="n"&&(u=e.endsWith("Capture"),t=e.slice(2,u?e.length-7:void 0),n=l[Fl]||null,n=n!=null?n[e]:null,typeof n=="function"&&l.removeEventListener(t,n,u),typeof a=="function")){typeof n!="function"&&n!==null&&(e in l?l[e]=null:l.hasAttribute(e)&&l.removeAttribute(e)),l.addEventListener(t,a,u);break l}e in 
l?l[e]=a:a===!0?l.setAttribute(e,""):xu(l,e,a)}}}function Ll(l,t,e){switch(t){case"div":case"span":case"svg":case"path":case"a":case"g":case"p":case"li":break;case"img":I("error",l),I("load",l);var a=!1,u=!1,n;for(n in e)if(e.hasOwnProperty(n)){var c=e[n];if(c!=null)switch(n){case"src":a=!0;break;case"srcSet":u=!0;break;case"children":case"dangerouslySetInnerHTML":throw Error(o(137,t));default:Sl(l,t,n,c,e,null)}}u&&Sl(l,t,"srcSet",e.srcSet,e,null),a&&Sl(l,t,"src",e.src,e,null);return;case"input":I("invalid",l);var i=n=c=u=null,f=null,r=null;for(a in e)if(e.hasOwnProperty(a)){var b=e[a];if(b!=null)switch(a){case"name":u=b;break;case"type":c=b;break;case"checked":f=b;break;case"defaultChecked":r=b;break;case"value":n=b;break;case"defaultValue":i=b;break;case"children":case"dangerouslySetInnerHTML":if(b!=null)throw Error(o(137,t));break;default:Sl(l,t,a,b,e,null)}}Bf(l,n,i,f,r,c,u,!1);return;case"select":I("invalid",l),a=c=n=null;for(u in e)if(e.hasOwnProperty(u)&&(i=e[u],i!=null))switch(u){case"value":n=i;break;case"defaultValue":c=i;break;case"multiple":a=i;default:Sl(l,t,u,i,e,null)}t=n,e=c,l.multiple=!!a,t!=null?we(l,!!a,t,!1):e!=null&&we(l,!!a,e,!0);return;case"textarea":I("invalid",l),n=u=a=null;for(c in e)if(e.hasOwnProperty(c)&&(i=e[c],i!=null))switch(c){case"value":a=i;break;case"defaultValue":u=i;break;case"children":n=i;break;case"dangerouslySetInnerHTML":if(i!=null)throw Error(o(91));break;default:Sl(l,t,c,i,e,null)}Yf(l,a,u,n);return;case"option":for(f in e)e.hasOwnProperty(f)&&(a=e[f],a!=null)&&(f==="selected"?l.selected=a&&typeof a!="function"&&typeof 
a!="symbol":Sl(l,t,f,a,e,null));return;case"dialog":I("beforetoggle",l),I("toggle",l),I("cancel",l),I("close",l);break;case"iframe":case"object":I("load",l);break;case"video":case"audio":for(a=0;a<du.length;a++)I(du[a],l);break;case"image":I("error",l),I("load",l);break;case"details":I("toggle",l);break;case"embed":case"source":case"link":I("error",l),I("load",l);case"area":case"base":case"br":case"col":case"hr":case"keygen":case"meta":case"param":case"track":case"wbr":case"menuitem":for(r in e)if(e.hasOwnProperty(r)&&(a=e[r],a!=null))switch(r){case"children":case"dangerouslySetInnerHTML":throw Error(o(137,t));default:Sl(l,t,r,a,e,null)}return;default:if(ac(t)){for(b in e)e.hasOwnProperty(b)&&(a=e[b],a!==void 0&&Vi(l,t,b,a,e,void 0));return}}for(i in e)e.hasOwnProperty(i)&&(a=e[i],a!=null&&Sl(l,t,i,a,e,null))}function xy(l,t,e,a){switch(t){case"div":case"span":case"svg":case"path":case"a":case"g":case"p":case"li":break;case"input":var u=null,n=null,c=null,i=null,f=null,r=null,b=null;for(S in e){var A=e[S];if(e.hasOwnProperty(S)&&A!=null)switch(S){case"checked":break;case"value":break;case"defaultValue":f=A;default:a.hasOwnProperty(S)||Sl(l,t,S,null,a,A)}}for(var v in a){var S=a[v];if(A=e[v],a.hasOwnProperty(v)&&(S!=null||A!=null))switch(v){case"type":n=S;break;case"name":u=S;break;case"checked":r=S;break;case"defaultChecked":b=S;break;case"value":c=S;break;case"defaultValue":i=S;break;case"children":case"dangerouslySetInnerHTML":if(S!=null)throw Error(o(137,t));break;default:S!==A&&Sl(l,t,v,S,a,A)}}tc(l,c,i,f,r,b,n,u);return;case"select":S=c=i=v=null;for(n in e)if(f=e[n],e.hasOwnProperty(n)&&f!=null)switch(n){case"value":break;case"multiple":S=f;default:a.hasOwnProperty(n)||Sl(l,t,n,null,a,f)}for(u in 
a)if(n=a[u],f=e[u],a.hasOwnProperty(u)&&(n!=null||f!=null))switch(u){case"value":v=n;break;case"defaultValue":i=n;break;case"multiple":c=n;default:n!==f&&Sl(l,t,u,n,a,f)}t=i,e=c,a=S,v!=null?we(l,!!e,v,!1):!!a!=!!e&&(t!=null?we(l,!!e,t,!0):we(l,!!e,e?[]:"",!1));return;case"textarea":S=v=null;for(i in e)if(u=e[i],e.hasOwnProperty(i)&&u!=null&&!a.hasOwnProperty(i))switch(i){case"value":break;case"children":break;default:Sl(l,t,i,null,a,u)}for(c in a)if(u=a[c],n=e[c],a.hasOwnProperty(c)&&(u!=null||n!=null))switch(c){case"value":v=u;break;case"defaultValue":S=u;break;case"children":break;case"dangerouslySetInnerHTML":if(u!=null)throw Error(o(91));break;default:u!==n&&Sl(l,t,c,u,a,n)}qf(l,v,S);return;case"option":for(var j in e)v=e[j],e.hasOwnProperty(j)&&v!=null&&!a.hasOwnProperty(j)&&(j==="selected"?l.selected=!1:Sl(l,t,j,null,a,v));for(f in a)v=a[f],S=e[f],a.hasOwnProperty(f)&&v!==S&&(v!=null||S!=null)&&(f==="selected"?l.selected=v&&typeof v!="function"&&typeof v!="symbol":Sl(l,t,f,v,a,S));return;case"img":case"link":case"area":case"base":case"br":case"col":case"embed":case"hr":case"keygen":case"meta":case"param":case"source":case"track":case"wbr":case"menuitem":for(var Z in e)v=e[Z],e.hasOwnProperty(Z)&&v!=null&&!a.hasOwnProperty(Z)&&Sl(l,t,Z,null,a,v);for(r in a)if(v=a[r],S=e[r],a.hasOwnProperty(r)&&v!==S&&(v!=null||S!=null))switch(r){case"children":case"dangerouslySetInnerHTML":if(v!=null)throw Error(o(137,t));break;default:Sl(l,t,r,v,a,S)}return;default:if(ac(t)){for(var bl in e)v=e[bl],e.hasOwnProperty(bl)&&v!==void 0&&!a.hasOwnProperty(bl)&&Vi(l,t,bl,void 0,a,v);for(b in a)v=a[b],S=e[b],!a.hasOwnProperty(b)||v===S||v===void 0&&S===void 0||Vi(l,t,b,v,a,S);return}}for(var m in e)v=e[m],e.hasOwnProperty(m)&&v!=null&&!a.hasOwnProperty(m)&&Sl(l,t,m,null,a,v);for(A in a)v=a[A],S=e[A],!a.hasOwnProperty(A)||v===S||v==null&&S==null||Sl(l,t,A,v,a,S)}function 
Xd(l){switch(l){case"css":case"script":case"font":case"img":case"image":case"input":case"link":return!0;default:return!1}}function Dy(){if(typeof performance.getEntriesByType=="function"){for(var l=0,t=0,e=performance.getEntriesByType("resource"),a=0;a<e.length;a++){var u=e[a],n=u.transferSize,c=u.initiatorType,i=u.duration;if(n&&i&&Xd(c)){for(c=0,i=u.responseEnd,a+=1;a<e.length;a++){var f=e[a],r=f.startTime;if(r>i)break;var b=f.transferSize,A=f.initiatorType;b&&Xd(A)&&(f=f.responseEnd,c+=b*(f<i?1:(i-r)/(f-r)))}if(--a,t+=8*(n+c)/(u.duration/1e3),l++,10<l)break}}if(0<l)return t/l/1e6}return navigator.connection&&(l=navigator.connection.downlink,typeof l=="number")?l:5}var Li=null,Ki=null;function Mn(l){return l.nodeType===9?l:l.ownerDocument}function Zd(l){switch(l){case"http://www.w3.org/2000/svg":return 1;case"http://www.w3.org/1998/Math/MathML":return 2;default:return 0}}function Vd(l,t){if(l===0)switch(t){case"svg":return 1;case"math":return 2;default:return 0}return l===1&&t==="foreignObject"?0:l}function Ji(l,t){return l==="textarea"||l==="noscript"||typeof t.children=="string"||typeof t.children=="number"||typeof t.children=="bigint"||typeof t.dangerouslySetInnerHTML=="object"&&t.dangerouslySetInnerHTML!==null&&t.dangerouslySetInnerHTML.__html!=null}var wi=null;function Ny(){var l=window.event;return l&&l.type==="popstate"?l===wi?!1:(wi=l,!0):(wi=null,!1)}var Ld=typeof setTimeout=="function"?setTimeout:void 0,Uy=typeof clearTimeout=="function"?clearTimeout:void 0,Kd=typeof Promise=="function"?Promise:void 0,Cy=typeof queueMicrotask=="function"?queueMicrotask:typeof Kd<"u"?function(l){return Kd.resolve(null).then(l).catch(Ry)}:Ld;function Ry(l){setTimeout(function(){throw l})}function ge(l){return l==="head"}function Jd(l,t){var e=t,a=0;do{var u=e.nextSibling;if(l.removeChild(e),u&&u.nodeType===8)if(e=u.data,e==="/$"||e==="/&"){if(a===0){l.removeChild(u),_a(t);return}a--}else if(e==="$"||e==="$?"||e==="$~"||e==="$!"||e==="&")a++;else 
if(e==="html")mu(l.ownerDocument.documentElement);else if(e==="head"){e=l.ownerDocument.head,mu(e);for(var n=e.firstChild;n;){var c=n.nextSibling,i=n.nodeName;n[Na]||i==="SCRIPT"||i==="STYLE"||i==="LINK"&&n.rel.toLowerCase()==="stylesheet"||e.removeChild(n),n=c}}else e==="body"&&mu(l.ownerDocument.body);e=u}while(e);_a(t)}function wd(l,t){var e=l;l=0;do{var a=e.nextSibling;if(e.nodeType===1?t?(e._stashedDisplay=e.style.display,e.style.display="none"):(e.style.display=e._stashedDisplay||"",e.getAttribute("style")===""&&e.removeAttribute("style")):e.nodeType===3&&(t?(e._stashedText=e.nodeValue,e.nodeValue=""):e.nodeValue=e._stashedText||""),a&&a.nodeType===8)if(e=a.data,e==="/$"){if(l===0)break;l--}else e!=="$"&&e!=="$?"&&e!=="$~"&&e!=="$!"||l++;e=a}while(e)}function Wi(l){var t=l.firstChild;for(t&&t.nodeType===10&&(t=t.nextSibling);t;){var e=t;switch(t=t.nextSibling,e.nodeName){case"HTML":case"HEAD":case"BODY":Wi(e),Pn(e);continue;case"SCRIPT":case"STYLE":continue;case"LINK":if(e.rel.toLowerCase()==="stylesheet")continue}l.removeChild(e)}}function jy(l,t,e,a){for(;l.nodeType===1;){var u=e;if(l.nodeName.toLowerCase()!==t.toLowerCase()){if(!a&&(l.nodeName!=="INPUT"||l.type!=="hidden"))break}else if(a){if(!l[Na])switch(t){case"meta":if(!l.hasAttribute("itemprop"))break;return l;case"link":if(n=l.getAttribute("rel"),n==="stylesheet"&&l.hasAttribute("data-precedence"))break;if(n!==u.rel||l.getAttribute("href")!==(u.href==null||u.href===""?null:u.href)||l.getAttribute("crossorigin")!==(u.crossOrigin==null?null:u.crossOrigin)||l.getAttribute("title")!==(u.title==null?null:u.title))break;return l;case"style":if(l.hasAttribute("data-precedence"))break;return l;case"script":if(n=l.getAttribute("src"),(n!==(u.src==null?null:u.src)||l.getAttribute("type")!==(u.type==null?null:u.type)||l.getAttribute("crossorigin")!==(u.crossOrigin==null?null:u.crossOrigin))&&n&&l.hasAttribute("async")&&!l.hasAttribute("itemprop"))break;return l;default:return l}}else 
if(t==="input"&&l.type==="hidden"){var n=u.name==null?null:""+u.name;if(u.type==="hidden"&&l.getAttribute("name")===n)return l}else return l;if(l=At(l.nextSibling),l===null)break}return null}function Hy(l,t,e){if(t==="")return null;for(;l.nodeType!==3;)if((l.nodeType!==1||l.nodeName!=="INPUT"||l.type!=="hidden")&&!e||(l=At(l.nextSibling),l===null))return null;return l}function Wd(l,t){for(;l.nodeType!==8;)if((l.nodeType!==1||l.nodeName!=="INPUT"||l.type!=="hidden")&&!t||(l=At(l.nextSibling),l===null))return null;return l}function $i(l){return l.data==="$?"||l.data==="$~"}function ki(l){return l.data==="$!"||l.data==="$?"&&l.ownerDocument.readyState!=="loading"}function By(l,t){var e=l.ownerDocument;if(l.data==="$~")l._reactRetry=t;else if(l.data!=="$?"||e.readyState!=="loading")t();else{var a=function(){t(),e.removeEventListener("DOMContentLoaded",a)};e.addEventListener("DOMContentLoaded",a),l._reactRetry=a}}function At(l){for(;l!=null;l=l.nextSibling){var t=l.nodeType;if(t===1||t===3)break;if(t===8){if(t=l.data,t==="$"||t==="$!"||t==="$?"||t==="$~"||t==="&"||t==="F!"||t==="F")break;if(t==="/$"||t==="/&")return null}}return l}var Fi=null;function $d(l){l=l.nextSibling;for(var t=0;l;){if(l.nodeType===8){var e=l.data;if(e==="/$"||e==="/&"){if(t===0)return At(l.nextSibling);t--}else e!=="$"&&e!=="$!"&&e!=="$?"&&e!=="$~"&&e!=="&"||t++}l=l.nextSibling}return null}function kd(l){l=l.previousSibling;for(var t=0;l;){if(l.nodeType===8){var e=l.data;if(e==="$"||e==="$!"||e==="$?"||e==="$~"||e==="&"){if(t===0)return l;t--}else e!=="/$"&&e!=="/&"||t++}l=l.previousSibling}return null}function Fd(l,t,e){switch(t=Mn(e),l){case"html":if(l=t.documentElement,!l)throw Error(o(452));return l;case"head":if(l=t.head,!l)throw Error(o(453));return l;case"body":if(l=t.body,!l)throw Error(o(454));return l;default:throw Error(o(451))}}function mu(l){for(var t=l.attributes;t.length;)l.removeAttributeNode(t[0]);Pn(l)}var pt=new Map,Id=new Set;function xn(l){return typeof 
l.getRootNode=="function"?l.getRootNode():l.nodeType===9?l:l.ownerDocument}var Ft=D.d;D.d={f:qy,r:Yy,D:Gy,C:Qy,L:Xy,m:Zy,X:Ly,S:Vy,M:Ky};function qy(){var l=Ft.f(),t=bn();return l||t}function Yy(l){var t=Le(l);t!==null&&t.tag===5&&t.type==="form"?h0(t):Ft.r(l)}var Ta=typeof document>"u"?null:document;function Pd(l,t,e){var a=Ta;if(a&&typeof t=="string"&&t){var u=vt(t);u='link[rel="'+l+'"][href="'+u+'"]',typeof e=="string"&&(u+='[crossorigin="'+e+'"]'),Id.has(u)||(Id.add(u),l={rel:l,crossOrigin:e,href:t},a.querySelector(u)===null&&(t=a.createElement("link"),Ll(t,"link",l),Yl(t),a.head.appendChild(t)))}}function Gy(l){Ft.D(l),Pd("dns-prefetch",l,null)}function Qy(l,t){Ft.C(l,t),Pd("preconnect",l,t)}function Xy(l,t,e){Ft.L(l,t,e);var a=Ta;if(a&&l&&t){var u='link[rel="preload"][as="'+vt(t)+'"]';t==="image"&&e&&e.imageSrcSet?(u+='[imagesrcset="'+vt(e.imageSrcSet)+'"]',typeof e.imageSizes=="string"&&(u+='[imagesizes="'+vt(e.imageSizes)+'"]')):u+='[href="'+vt(l)+'"]';var n=u;switch(t){case"style":n=Aa(l);break;case"script":n=pa(l)}pt.has(n)||(l=U({rel:"preload",href:t==="image"&&e&&e.imageSrcSet?void 0:l,as:t},e),pt.set(n,l),a.querySelector(u)!==null||t==="style"&&a.querySelector(yu(n))||t==="script"&&a.querySelector(ru(n))||(t=a.createElement("link"),Ll(t,"link",l),Yl(t),a.head.appendChild(t)))}}function Zy(l,t){Ft.m(l,t);var e=Ta;if(e&&l){var a=t&&typeof t.as=="string"?t.as:"script",u='link[rel="modulepreload"][as="'+vt(a)+'"][href="'+vt(l)+'"]',n=u;switch(a){case"audioworklet":case"paintworklet":case"serviceworker":case"sharedworker":case"worker":case"script":n=pa(l)}if(!pt.has(n)&&(l=U({rel:"modulepreload",href:l},t),pt.set(n,l),e.querySelector(u)===null)){switch(a){case"audioworklet":case"paintworklet":case"serviceworker":case"sharedworker":case"worker":case"script":if(e.querySelector(ru(n)))return}a=e.createElement("link"),Ll(a,"link",l),Yl(a),e.head.appendChild(a)}}}function Vy(l,t,e){Ft.S(l,t,e);var a=Ta;if(a&&l){var 
u=Ke(a).hoistableStyles,n=Aa(l);t=t||"default";var c=u.get(n);if(!c){var i={loading:0,preload:null};if(c=a.querySelector(yu(n)))i.loading=5;else{l=U({rel:"stylesheet",href:l,"data-precedence":t},e),(e=pt.get(n))&&Ii(l,e);var f=c=a.createElement("link");Yl(f),Ll(f,"link",l),f._p=new Promise(function(r,b){f.onload=r,f.onerror=b}),f.addEventListener("load",function(){i.loading|=1}),f.addEventListener("error",function(){i.loading|=2}),i.loading|=4,Dn(c,t,a)}c={type:"stylesheet",instance:c,count:1,state:i},u.set(n,c)}}}function Ly(l,t){Ft.X(l,t);var e=Ta;if(e&&l){var a=Ke(e).hoistableScripts,u=pa(l),n=a.get(u);n||(n=e.querySelector(ru(u)),n||(l=U({src:l,async:!0},t),(t=pt.get(u))&&Pi(l,t),n=e.createElement("script"),Yl(n),Ll(n,"link",l),e.head.appendChild(n)),n={type:"script",instance:n,count:1,state:null},a.set(u,n))}}function Ky(l,t){Ft.M(l,t);var e=Ta;if(e&&l){var a=Ke(e).hoistableScripts,u=pa(l),n=a.get(u);n||(n=e.querySelector(ru(u)),n||(l=U({src:l,async:!0,type:"module"},t),(t=pt.get(u))&&Pi(l,t),n=e.createElement("script"),Yl(n),Ll(n,"link",l),e.head.appendChild(n)),n={type:"script",instance:n,count:1,state:null},a.set(u,n))}}function lo(l,t,e,a){var u=(u=K.current)?xn(u):null;if(!u)throw Error(o(446));switch(l){case"meta":case"title":return null;case"style":return typeof e.precedence=="string"&&typeof e.href=="string"?(t=Aa(e.href),e=Ke(u).hoistableStyles,a=e.get(t),a||(a={type:"style",instance:null,count:0,state:null},e.set(t,a)),a):{type:"void",instance:null,count:0,state:null};case"link":if(e.rel==="stylesheet"&&typeof e.href=="string"&&typeof e.precedence=="string"){l=Aa(e.href);var 
n=Ke(u).hoistableStyles,c=n.get(l);if(c||(u=u.ownerDocument||u,c={type:"stylesheet",instance:null,count:0,state:{loading:0,preload:null}},n.set(l,c),(n=u.querySelector(yu(l)))&&!n._p&&(c.instance=n,c.state.loading=5),pt.has(l)||(e={rel:"preload",as:"style",href:e.href,crossOrigin:e.crossOrigin,integrity:e.integrity,media:e.media,hrefLang:e.hrefLang,referrerPolicy:e.referrerPolicy},pt.set(l,e),n||Jy(u,l,e,c.state))),t&&a===null)throw Error(o(528,""));return c}if(t&&a!==null)throw Error(o(529,""));return null;case"script":return t=e.async,e=e.src,typeof e=="string"&&t&&typeof t!="function"&&typeof t!="symbol"?(t=pa(e),e=Ke(u).hoistableScripts,a=e.get(t),a||(a={type:"script",instance:null,count:0,state:null},e.set(t,a)),a):{type:"void",instance:null,count:0,state:null};default:throw Error(o(444,l))}}function Aa(l){return'href="'+vt(l)+'"'}function yu(l){return'link[rel="stylesheet"]['+l+"]"}function to(l){return U({},l,{"data-precedence":l.precedence,precedence:null})}function Jy(l,t,e,a){l.querySelector('link[rel="preload"][as="style"]['+t+"]")?a.loading=1:(t=l.createElement("link"),a.preload=t,t.addEventListener("load",function(){return a.loading|=1}),t.addEventListener("error",function(){return a.loading|=2}),Ll(t,"link",e),Yl(t),l.head.appendChild(t))}function pa(l){return'[src="'+vt(l)+'"]'}function ru(l){return"script[async]"+l}function eo(l,t,e){if(t.count++,t.instance===null)switch(t.type){case"style":var a=l.querySelector('style[data-href~="'+vt(e.href)+'"]');if(a)return t.instance=a,Yl(a),a;var u=U({},e,{"data-href":e.href,"data-precedence":e.precedence,href:null,precedence:null});return a=(l.ownerDocument||l).createElement("style"),Yl(a),Ll(a,"style",u),Dn(a,e.precedence,l),t.instance=a;case"stylesheet":u=Aa(e.href);var n=l.querySelector(yu(u));if(n)return t.state.loading|=4,t.instance=n,Yl(n),n;a=to(e),(u=pt.get(u))&&Ii(a,u),n=(l.ownerDocument||l).createElement("link"),Yl(n);var c=n;return c._p=new 
Promise(function(i,f){c.onload=i,c.onerror=f}),Ll(n,"link",a),t.state.loading|=4,Dn(n,e.precedence,l),t.instance=n;case"script":return n=pa(e.src),(u=l.querySelector(ru(n)))?(t.instance=u,Yl(u),u):(a=e,(u=pt.get(n))&&(a=U({},e),Pi(a,u)),l=l.ownerDocument||l,u=l.createElement("script"),Yl(u),Ll(u,"link",a),l.head.appendChild(u),t.instance=u);case"void":return null;default:throw Error(o(443,t.type))}else t.type==="stylesheet"&&(t.state.loading&4)===0&&(a=t.instance,t.state.loading|=4,Dn(a,e.precedence,l));return t.instance}function Dn(l,t,e){for(var a=e.querySelectorAll('link[rel="stylesheet"][data-precedence],style[data-precedence]'),u=a.length?a[a.length-1]:null,n=u,c=0;c<a.length;c++){var i=a[c];if(i.dataset.precedence===t)n=i;else if(n!==u)break}n?n.parentNode.insertBefore(l,n.nextSibling):(t=e.nodeType===9?e.head:e,t.insertBefore(l,t.firstChild))}function Ii(l,t){l.crossOrigin==null&&(l.crossOrigin=t.crossOrigin),l.referrerPolicy==null&&(l.referrerPolicy=t.referrerPolicy),l.title==null&&(l.title=t.title)}function Pi(l,t){l.crossOrigin==null&&(l.crossOrigin=t.crossOrigin),l.referrerPolicy==null&&(l.referrerPolicy=t.referrerPolicy),l.integrity==null&&(l.integrity=t.integrity)}var Nn=null;function ao(l,t,e){if(Nn===null){var a=new Map,u=Nn=new Map;u.set(e,a)}else u=Nn,a=u.get(e),a||(a=new Map,u.set(e,a));if(a.has(l))return a;for(a.set(l,null),e=e.getElementsByTagName(l),u=0;u<e.length;u++){var n=e[u];if(!(n[Na]||n[Ql]||l==="link"&&n.getAttribute("rel")==="stylesheet")&&n.namespaceURI!=="http://www.w3.org/2000/svg"){var c=n.getAttribute(t)||"";c=l+c;var i=a.get(c);i?i.push(n):a.set(c,[n])}}return a}function uo(l,t,e){l=l.ownerDocument||l,l.head.insertBefore(e,t==="title"?l.querySelector("head > title"):null)}function wy(l,t,e){if(e===1||t.itemProp!=null)return!1;switch(l){case"meta":case"title":return!0;case"style":if(typeof t.precedence!="string"||typeof t.href!="string"||t.href==="")break;return!0;case"link":if(typeof t.rel!="string"||typeof 
t.href!="string"||t.href===""||t.onLoad||t.onError)break;return t.rel==="stylesheet"?(l=t.disabled,typeof t.precedence=="string"&&l==null):!0;case"script":if(t.async&&typeof t.async!="function"&&typeof t.async!="symbol"&&!t.onLoad&&!t.onError&&t.src&&typeof t.src=="string")return!0}return!1}function no(l){return!(l.type==="stylesheet"&&(l.state.loading&3)===0)}function Wy(l,t,e,a){if(e.type==="stylesheet"&&(typeof a.media!="string"||matchMedia(a.media).matches!==!1)&&(e.state.loading&4)===0){if(e.instance===null){var u=Aa(a.href),n=t.querySelector(yu(u));if(n){t=n._p,t!==null&&typeof t=="object"&&typeof t.then=="function"&&(l.count++,l=Un.bind(l),t.then(l,l)),e.state.loading|=4,e.instance=n,Yl(n);return}n=t.ownerDocument||t,a=to(a),(u=pt.get(u))&&Ii(a,u),n=n.createElement("link"),Yl(n);var c=n;c._p=new Promise(function(i,f){c.onload=i,c.onerror=f}),Ll(n,"link",a),e.instance=n}l.stylesheets===null&&(l.stylesheets=new Map),l.stylesheets.set(e,t),(t=e.state.preload)&&(e.state.loading&3)===0&&(l.count++,e=Un.bind(l),t.addEventListener("load",e),t.addEventListener("error",e))}}var lf=0;function $y(l,t){return l.stylesheets&&l.count===0&&Rn(l,l.stylesheets),0<l.count||0<l.imgCount?function(e){var a=setTimeout(function(){if(l.stylesheets&&Rn(l,l.stylesheets),l.unsuspend){var n=l.unsuspend;l.unsuspend=null,n()}},6e4+t);0<l.imgBytes&&lf===0&&(lf=62500*Dy());var u=setTimeout(function(){if(l.waitingForImages=!1,l.count===0&&(l.stylesheets&&Rn(l,l.stylesheets),l.unsuspend)){var n=l.unsuspend;l.unsuspend=null,n()}},(l.imgBytes>lf?50:800)+t);return l.unsuspend=e,function(){l.unsuspend=null,clearTimeout(a),clearTimeout(u)}}:null}function Un(){if(this.count--,this.count===0&&(this.imgCount===0||!this.waitingForImages)){if(this.stylesheets)Rn(this,this.stylesheets);else if(this.unsuspend){var l=this.unsuspend;this.unsuspend=null,l()}}}var Cn=null;function Rn(l,t){l.stylesheets=null,l.unsuspend!==null&&(l.count++,Cn=new Map,t.forEach(ky,l),Cn=null,Un.call(l))}function 
ky(l,t){if(!(t.state.loading&4)){var e=Cn.get(l);if(e)var a=e.get(null);else{e=new Map,Cn.set(l,e);for(var u=l.querySelectorAll("link[data-precedence],style[data-precedence]"),n=0;n<u.length;n++){var c=u[n];(c.nodeName==="LINK"||c.getAttribute("media")!=="not all")&&(e.set(c.dataset.precedence,c),a=c)}a&&e.set(null,a)}u=t.instance,c=u.getAttribute("data-precedence"),n=e.get(c)||a,n===a&&e.set(null,u),e.set(c,u),this.count++,a=Un.bind(this),u.addEventListener("load",a),u.addEventListener("error",a),n?n.parentNode.insertBefore(u,n.nextSibling):(l=l.nodeType===9?l.head:l,l.insertBefore(u,l.firstChild)),t.state.loading|=4}}var hu={$$typeof:nl,Provider:null,Consumer:null,_currentValue:V,_currentValue2:V,_threadCount:0};function Fy(l,t,e,a,u,n,c,i,f){this.tag=1,this.containerInfo=l,this.pingCache=this.current=this.pendingChildren=null,this.timeoutHandle=-1,this.callbackNode=this.next=this.pendingContext=this.context=this.cancelPendingCommit=null,this.callbackPriority=0,this.expirationTimes=$n(-1),this.entangledLanes=this.shellSuspendCounter=this.errorRecoveryDisabledLanes=this.expiredLanes=this.warmLanes=this.pingedLanes=this.suspendedLanes=this.pendingLanes=0,this.entanglements=$n(0),this.hiddenUpdates=$n(null),this.identifierPrefix=a,this.onUncaughtError=u,this.onCaughtError=n,this.onRecoverableError=c,this.pooledCache=null,this.pooledCacheLanes=0,this.formState=f,this.incompleteTransitions=new Map}function co(l,t,e,a,u,n,c,i,f,r,b,A){return l=new Fy(l,t,e,c,f,r,b,A,i),t=1,n===!0&&(t|=24),n=st(3,null,null,t),l.current=n,n.stateNode=l,t=Rc(),t.refCount++,l.pooledCache=t,t.refCount++,n.memoizedState={element:a,isDehydrated:e,cache:t},qc(n),l}function io(l){return l?(l=ta,l):ta}function fo(l,t,e,a,u,n){u=io(u),a.context===null?a.context=u:a.pendingContext=u,a=ce(t),a.payload={element:e},n=n===void 0?null:n,n!==null&&(a.callback=n),e=ie(l,a,t),e!==null&&(at(e,l,t),wa(e,l,t))}function so(l,t){if(l=l.memoizedState,l!==null&&l.dehydrated!==null){var 
e=l.retryLane;l.retryLane=e!==0&&e<t?e:t}}function tf(l,t){so(l,t),(l=l.alternate)&&so(l,t)}function oo(l){if(l.tag===13||l.tag===31){var t=De(l,67108864);t!==null&&at(t,l,67108864),tf(l,67108864)}}function mo(l){if(l.tag===13||l.tag===31){var t=rt();t=kn(t);var e=De(l,t);e!==null&&at(e,l,t),tf(l,t)}}var jn=!0;function Iy(l,t,e,a){var u=z.T;z.T=null;var n=D.p;try{D.p=2,ef(l,t,e,a)}finally{D.p=n,z.T=u}}function Py(l,t,e,a){var u=z.T;z.T=null;var n=D.p;try{D.p=8,ef(l,t,e,a)}finally{D.p=n,z.T=u}}function ef(l,t,e,a){if(jn){var u=af(a);if(u===null)Zi(l,t,a,Hn,e),ro(l,a);else if(tr(u,l,t,e,a))a.stopPropagation();else if(ro(l,a),t&4&&-1<lr.indexOf(l)){for(;u!==null;){var n=Le(u);if(n!==null)switch(n.tag){case 3:if(n=n.stateNode,n.current.memoizedState.isDehydrated){var c=pe(n.pendingLanes);if(c!==0){var i=n;for(i.pendingLanes|=2,i.entangledLanes|=2;c;){var f=1<<31-it(c);i.entanglements[1]|=f,c&=~f}Ct(n),(sl&6)===0&&(gn=nt()+500,su(0))}}break;case 31:case 13:i=De(n,2),i!==null&&at(i,n,2),bn(),tf(n,2)}if(n=af(a),n===null&&Zi(l,t,a,Hn,e),n===u)break;u=n}u!==null&&a.stopPropagation()}else Zi(l,t,a,null,e)}}function af(l){return l=nc(l),uf(l)}var Hn=null;function uf(l){if(Hn=null,l=Ve(l),l!==null){var t=Y(l);if(t===null)l=null;else{var e=t.tag;if(e===13){if(l=H(t),l!==null)return l;l=null}else if(e===31){if(l=R(t),l!==null)return l;l=null}else if(e===3){if(t.stateNode.current.memoizedState.isDehydrated)return t.tag===3?t.stateNode.containerInfo:null;l=null}else t!==l&&(l=null)}}return Hn=l,null}function 
yo(l){switch(l){case"beforetoggle":case"cancel":case"click":case"close":case"contextmenu":case"copy":case"cut":case"auxclick":case"dblclick":case"dragend":case"dragstart":case"drop":case"focusin":case"focusout":case"input":case"invalid":case"keydown":case"keypress":case"keyup":case"mousedown":case"mouseup":case"paste":case"pause":case"play":case"pointercancel":case"pointerdown":case"pointerup":case"ratechange":case"reset":case"resize":case"seeked":case"submit":case"toggle":case"touchcancel":case"touchend":case"touchstart":case"volumechange":case"change":case"selectionchange":case"textInput":case"compositionstart":case"compositionend":case"compositionupdate":case"beforeblur":case"afterblur":case"beforeinput":case"blur":case"fullscreenchange":case"focus":case"hashchange":case"popstate":case"select":case"selectstart":return 2;case"drag":case"dragenter":case"dragexit":case"dragleave":case"dragover":case"mousemove":case"mouseout":case"mouseover":case"pointermove":case"pointerout":case"pointerover":case"scroll":case"touchmove":case"wheel":case"mouseenter":case"mouseleave":case"pointerenter":case"pointerleave":return 8;case"message":switch(Go()){case zf:return 2;case Ef:return 8;case Au:case Qo:return 32;case Tf:return 268435456;default:return 32}default:return 32}}var nf=!1,Se=null,be=null,ze=null,vu=new Map,gu=new Map,Ee=[],lr="mousedown mouseup touchcancel touchend touchstart auxclick dblclick pointercancel pointerdown pointerup dragend dragstart drop compositionend compositionstart keydown keypress keyup input textInput copy cut paste click change contextmenu reset".split(" ");function ro(l,t){switch(l){case"focusin":case"focusout":Se=null;break;case"dragenter":case"dragleave":be=null;break;case"mouseover":case"mouseout":ze=null;break;case"pointerover":case"pointerout":vu.delete(t.pointerId);break;case"gotpointercapture":case"lostpointercapture":gu.delete(t.pointerId)}}function Su(l,t,e,a,u,n){return 
l===null||l.nativeEvent!==n?(l={blockedOn:t,domEventName:e,eventSystemFlags:a,nativeEvent:n,targetContainers:[u]},t!==null&&(t=Le(t),t!==null&&oo(t)),l):(l.eventSystemFlags|=a,t=l.targetContainers,u!==null&&t.indexOf(u)===-1&&t.push(u),l)}function tr(l,t,e,a,u){switch(t){case"focusin":return Se=Su(Se,l,t,e,a,u),!0;case"dragenter":return be=Su(be,l,t,e,a,u),!0;case"mouseover":return ze=Su(ze,l,t,e,a,u),!0;case"pointerover":var n=u.pointerId;return vu.set(n,Su(vu.get(n)||null,l,t,e,a,u)),!0;case"gotpointercapture":return n=u.pointerId,gu.set(n,Su(gu.get(n)||null,l,t,e,a,u)),!0}return!1}function ho(l){var t=Ve(l.target);if(t!==null){var e=Y(t);if(e!==null){if(t=e.tag,t===13){if(t=H(e),t!==null){l.blockedOn=t,xf(l.priority,function(){mo(e)});return}}else if(t===31){if(t=R(e),t!==null){l.blockedOn=t,xf(l.priority,function(){mo(e)});return}}else if(t===3&&e.stateNode.current.memoizedState.isDehydrated){l.blockedOn=e.tag===3?e.stateNode.containerInfo:null;return}}}l.blockedOn=null}function Bn(l){if(l.blockedOn!==null)return!1;for(var t=l.targetContainers;0<t.length;){var e=af(l.nativeEvent);if(e===null){e=l.nativeEvent;var a=new e.constructor(e.type,e);uc=a,e.target.dispatchEvent(a),uc=null}else return t=Le(e),t!==null&&oo(t),l.blockedOn=e,!1;t.shift()}return!0}function vo(l,t,e){Bn(l)&&e.delete(t)}function er(){nf=!1,Se!==null&&Bn(Se)&&(Se=null),be!==null&&Bn(be)&&(be=null),ze!==null&&Bn(ze)&&(ze=null),vu.forEach(vo),gu.forEach(vo)}function qn(l,t){l.blockedOn===t&&(l.blockedOn=null,nf||(nf=!0,h.unstable_scheduleCallback(h.unstable_NormalPriority,er)))}var Yn=null;function go(l){Yn!==l&&(Yn=l,h.unstable_scheduleCallback(h.unstable_NormalPriority,function(){Yn===l&&(Yn=null);for(var t=0;t<l.length;t+=3){var e=l[t],a=l[t+1],u=l[t+2];if(typeof a!="function"){if(uf(a||e)===null)continue;break}var n=Le(e);n!==null&&(l.splice(t,3),t-=3,ai(n,{pending:!0,data:u,method:e.method,action:a},a,u))}}))}function _a(l){function t(f){return 
qn(f,l)}Se!==null&&qn(Se,l),be!==null&&qn(be,l),ze!==null&&qn(ze,l),vu.forEach(t),gu.forEach(t);for(var e=0;e<Ee.length;e++){var a=Ee[e];a.blockedOn===l&&(a.blockedOn=null)}for(;0<Ee.length&&(e=Ee[0],e.blockedOn===null);)ho(e),e.blockedOn===null&&Ee.shift();if(e=(l.ownerDocument||l).$$reactFormReplay,e!=null)for(a=0;a<e.length;a+=3){var u=e[a],n=e[a+1],c=u[Fl]||null;if(typeof n=="function")c||go(e);else if(c){var i=null;if(n&&n.hasAttribute("formAction")){if(u=n,c=n[Fl]||null)i=c.formAction;else if(uf(u)!==null)continue}else i=c.action;typeof i=="function"?e[a+1]=i:(e.splice(a,3),a-=3),go(e)}}}function So(){function l(n){n.canIntercept&&n.info==="react-transition"&&n.intercept({handler:function(){return new Promise(function(c){return u=c})},focusReset:"manual",scroll:"manual"})}function t(){u!==null&&(u(),u=null),a||setTimeout(e,20)}function e(){if(!a&&!navigation.transition){var n=navigation.currentEntry;n&&n.url!=null&&navigation.navigate(n.url,{state:n.getState(),info:"react-transition",history:"replace"})}}if(typeof navigation=="object"){var a=!1,u=null;return navigation.addEventListener("navigate",l),navigation.addEventListener("navigatesuccess",t),navigation.addEventListener("navigateerror",t),setTimeout(e,100),function(){a=!0,navigation.removeEventListener("navigate",l),navigation.removeEventListener("navigatesuccess",t),navigation.removeEventListener("navigateerror",t),u!==null&&(u(),u=null)}}}function cf(l){this._internalRoot=l}Gn.prototype.render=cf.prototype.render=function(l){var t=this._internalRoot;if(t===null)throw Error(o(409));var e=t.current,a=rt();fo(e,a,l,t,null,null)},Gn.prototype.unmount=cf.prototype.unmount=function(){var l=this._internalRoot;if(l!==null){this._internalRoot=null;var t=l.containerInfo;fo(l.current,2,null,l,null,null),bn(),t[Ze]=null}};function Gn(l){this._internalRoot=l}Gn.prototype.unstable_scheduleHydration=function(l){if(l){var t=Mf();l={blockedOn:null,target:l,priority:t};for(var 
e=0;e<Ee.length&&t!==0&&t<Ee[e].priority;e++);Ee.splice(e,0,l),e===0&&ho(l)}};var bo=M.version;if(bo!=="19.2.4")throw Error(o(527,bo,"19.2.4"));D.findDOMNode=function(l){var t=l._reactInternals;if(t===void 0)throw typeof l.render=="function"?Error(o(188)):(l=Object.keys(l).join(","),Error(o(268,l)));return l=E(t),l=l!==null?G(l):null,l=l===null?null:l.stateNode,l};var ar={bundleType:0,version:"19.2.4",rendererPackageName:"react-dom",currentDispatcherRef:z,reconcilerVersion:"19.2.4"};if(typeof __REACT_DEVTOOLS_GLOBAL_HOOK__<"u"){var Qn=__REACT_DEVTOOLS_GLOBAL_HOOK__;if(!Qn.isDisabled&&Qn.supportsFiber)try{Ma=Qn.inject(ar),ct=Qn}catch{}}return zu.createRoot=function(l,t){if(!q(l))throw Error(o(299));var e=!1,a="",u=_0,n=O0,c=M0;return t!=null&&(t.unstable_strictMode===!0&&(e=!0),t.identifierPrefix!==void 0&&(a=t.identifierPrefix),t.onUncaughtError!==void 0&&(u=t.onUncaughtError),t.onCaughtError!==void 0&&(n=t.onCaughtError),t.onRecoverableError!==void 0&&(c=t.onRecoverableError)),t=co(l,1,!1,null,null,e,a,null,u,n,c,So),l[Ze]=t.current,Xi(l),new cf(t)},zu.hydrateRoot=function(l,t,e){if(!q(l))throw Error(o(299));var a=!1,u="",n=_0,c=O0,i=M0,f=null;return e!=null&&(e.unstable_strictMode===!0&&(a=!0),e.identifierPrefix!==void 0&&(u=e.identifierPrefix),e.onUncaughtError!==void 0&&(n=e.onUncaughtError),e.onCaughtError!==void 0&&(c=e.onCaughtError),e.onRecoverableError!==void 0&&(i=e.onRecoverableError),e.formState!==void 0&&(f=e.formState)),t=co(l,1,!0,t,e??null,a,u,f,n,c,i,So),t.context=io(null),e=t.current,a=rt(),a=kn(a),u=ce(a),u.callback=null,ie(e,u,a),e=a,t.current.lanes=e,Da(t,e),Ct(t),l[Ze]=t.current,Xi(l),new Gn(t)},zu.version="19.2.4",zu}var Do;function yr(){if(Do)return df.exports;Do=1;function h(){if(!(typeof __REACT_DEVTOOLS_GLOBAL_HOOK__>"u"||typeof __REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE!="function"))try{__REACT_DEVTOOLS_GLOBAL_HOOK__.checkDCE(h)}catch(M){console.error(M)}}return h(),df.exports=mr(),df.exports}var rr=yr();const 
hr=[{id:"all",label:"All"},{id:"lambda",label:"Lambda"},{id:"stateful",label:"Stateful"}],vr=[{id:"all",label:"All"},{id:"og18",label:"OG-18"},{id:"advanced_reasoning",label:"Advanced Reasoning"}],gf=["backend","mode","family","quant","replay"],gr=[{id:"reforged",label:"Reforged"},{id:"bare-vs-reforged",label:"Reforged vs Bare"},{id:"ablation",label:"Full Ablation"}],No=["reforged","bare","no_rescue","no_nudge","no_steps","no_recovery","no_compact"],Uo=["none","keep-last","full"],hf=[{id:"all",label:"All",groupBy:[]},{id:"by-backend",label:"By Backend",groupBy:["model","quant","ablation","replay"],intraSort:"backend"},{id:"by-family",label:"By Family",groupBy:["family"]}];async function Sr(){const h=window;if(!h.__FORGE_DATA__)throw new Error("window.__FORGE_DATA__ not injected — build via `python -m tests.eval.report <jsonl> --html <out>`");return h.__FORGE_DATA__}function Co(h,M){if(M==="reforged")return h.filter(o=>o.ablation==="reforged");if(M==="bare-vs-reforged")return h.filter(o=>o.ablation==="reforged"||o.ablation==="bare");const C=new Set;for(const o of h)o.ablation.startsWith("no_")&&C.add(`${o.model}\0${o.backend}\0${o.mode}`);return h.filter(o=>C.has(`${o.model}\0${o.backend}\0${o.mode}`))}function Ro(h){const M=No.indexOf(h);return M===-1?No.length:M}function Zn(h){const M=Uo.indexOf(h);return M===-1?Uo.length:M}function Eu(h){return h==null?"":h>=95?"text-emerald-400":h>=90?"text-emerald-500/80":h>=70?"text-amber-400":h>=50?"text-orange-400":"text-red-400"}function Xn(h,M=0){return h==null?"—":`${h.toFixed(M)}%`}const br={0:"⁰",1:"¹",2:"²",3:"³",4:"⁴",5:"⁵",6:"⁶",7:"⁷",8:"⁸",9:"⁹"};function zr(h){return String(h).split("").map(M=>br[M]??M).join("")}function Er(h,M,C,o,q){const Y=p=>p.endsWith("_stateful");let H=M;return 
C==="lambda"?H=H.filter(p=>!Y(p)):C==="stateful"&&(H=H.filter(Y)),o==="og18"?H=H.filter(p=>q[p]==="og18"):o==="advanced_reasoning"&&(H=H.filter(p=>q[p]==="advanced_reasoning")),H.length===0?{rows:h,scenarios:M}:H.length===M.length?{rows:h,scenarios:M}:{rows:h.map(p=>{let E=0,G=0,U=0,x=0,P=0,k=0,vl=0,ul=0,zl=0,$=0;for(const fl of H)E+=p.scenarioRuns?.[fl]??0,G+=p.scenarioCorrect?.[fl]??0,U+=p.scenarioCompleted?.[fl]??0,x+=p.scenarioValidated?.[fl]??0,P+=p.scenarioIdealCalls?.[fl]??0,k+=p.scenarioActualCalls?.[fl]??0,vl+=p.scenarioWastedSum?.[fl]??0,ul+=p.scenarioWastedN?.[fl]??0,zl+=p.scenarioSpeedSum?.[fl]??0,$+=p.scenarioSpeedN?.[fl]??0;const nl=fl=>Math.round(fl*10)/10,al=E>0?nl(G/E*100):0,Hl=x>0?nl(G/x*100):null,Tl=E>0?nl(U/E*100):0,W=k>0?nl(Math.min(P/k,1)*100):0,Bl=ul>0?nl(vl/ul):0,Kl=$>0?nl(zl/$):0,$l=p.scenarioCompleted!==void 0,kl=Math.max(0,...H.map(fl=>p.scenarioRuns?.[fl]??0));return{...p,score:al,accuracy:$l?Hl:p.accuracy,completeness:$l?Tl:p.completeness,efficiency:$l?W:p.efficiency,wasted:$l?Bl:p.wasted,speed:$l?Kl:p.speed,n:kl}}),scenarios:H}}function Tr(h,M,C){const o=[...h];return o.sort((q,Y)=>{let H,R;return M.col==="label"?(H=q.label,R=Y.label):C.includes(M.col)?(H=q.scenarios[M.col]??-1,R=Y.scenarios[M.col]??-1):(H=q[M.col]??-1,R=Y[M.col]??-1),typeof H=="string"&&typeof R=="string"?M.asc?H.localeCompare(R):R.localeCompare(H):M.asc?H-R:R-H}),o}function Ar(h,M){return M.map(C=>String(h[C])).join("\0")}function pr(h,M,C,o,q){const Y=q==="reforged"?M:{id:M.id,label:M.label,groupBy:["model","backend","mode"]},H=q!=="reforged";if(Y.groupBy.length===0)return{sorted:Tr(h,C,o),groups:[]};const R=new Map;for(const G of h){const U=Ar(G,Y.groupBy);R.has(U)||R.set(U,[]),R.get(U).push(G)}const p=[];for(const[G,U]of R){U.sort((P,k)=>{if(H){const ul=Ro(P.ablation)-Ro(k.ablation);if(ul!==0)return ul;const zl=Zn(P.replay)-Zn(k.replay);return zl!==0?zl:k.score-P.score}const vl=k.score-P.score;return 
vl!==0?vl:Y.intraSort?String(P[Y.intraSort]).localeCompare(String(k[Y.intraSort])):0});const x=Y.groupBy.map(P=>U[0][P]).join(" / ");p.push({key:G,label:x,rows:U})}return p.sort((G,U)=>{const x=Math.max(...G.rows.map(k=>k.score));return Math.max(...U.rows.map(k=>k.score))-x}),{sorted:p.flatMap(G=>G.rows),groups:p}}function _r({active:h,onChange:M}){return _.jsxs("fieldset",{className:"mb-4",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1 mb-1",children:"Screen"}),_.jsx("div",{className:"flex flex-col rounded border border-zinc-700 overflow-hidden",children:gr.map((C,o)=>_.jsx("button",{onClick:()=>M(C.id),className:`text-xs px-2 py-1.5 text-left transition-colors ${o>0?"border-t border-zinc-700":""} ${h===C.id?"bg-emerald-500/20 text-emerald-300 font-medium":"bg-zinc-900/40 text-zinc-400 hover:bg-zinc-900/70 hover:text-zinc-200"}`,children:C.label},C.id))})]})}function Or({active:h,onChange:M}){return _.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:"View"}),_.jsx("div",{className:"flex flex-wrap gap-1",children:hf.map(C=>_.jsx("button",{onClick:()=>M(C.id),className:`text-[0.65rem] px-2 py-0.5 rounded-full border transition-colors ${h===C.id?"border-emerald-500 bg-emerald-500/15 text-emerald-400":"border-zinc-700 text-zinc-500 hover:border-zinc-500 hover:text-zinc-300"}`,children:C.label},C.id))})]})}const Mr={backend:"Backend",mode:"Mode",family:"Family",quant:"Quant",replay:"Reasoning Replay"};function xr({rows:h,filters:M,onFilterChange:C,activeScreen:o,onScreenChange:q,activeView:Y,onViewChange:H,scenarioScope:R,onScopeChange:p,suiteScope:E,onSuiteChange:G,showRetired:U,onShowRetiredChange:x,hasRetired:P,filteredCount:k,totalCount:vl,totalRuns:ul,timestamp:zl}){return _.jsxs("nav",{className:"w-52 min-w-52 shrink-0 border-r border-zinc-800 p-4 sticky top-0 
h-screen overflow-y-auto bg-zinc-950/80",children:[_.jsx("h1",{className:"text-lg font-semibold mb-0.5",children:"Forge Eval"}),_.jsxs("p",{className:"text-xs text-zinc-500 mb-3",children:[k,"/",vl," configs ·"," ",ul.toLocaleString()," runs"]}),_.jsx(_r,{active:o,onChange:q}),_.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:"Suite"}),_.jsx("div",{className:"flex flex-wrap gap-1",children:vr.map($=>_.jsx("button",{onClick:()=>G($.id),className:`text-[0.65rem] px-2 py-0.5 rounded-full border transition-colors ${E===$.id?"border-emerald-500 bg-emerald-500/15 text-emerald-400":"border-zinc-700 text-zinc-500 hover:border-zinc-500 hover:text-zinc-300"}`,children:$.label},$.id))})]}),_.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:"Scenarios"}),_.jsx("div",{className:"flex flex-wrap gap-1",children:hr.map($=>_.jsx("button",{onClick:()=>p($.id),className:`text-[0.65rem] px-2 py-0.5 rounded-full border transition-colors ${R===$.id?"border-emerald-500 bg-emerald-500/15 text-emerald-400":"border-zinc-700 text-zinc-500 hover:border-zinc-500 hover:text-zinc-300"}`,children:$.label},$.id))})]}),o==="reforged"&&_.jsx(Or,{active:Y,onChange:H}),gf.map($=>{const nl=[...new Set(h.map(al=>al[$]))].sort($==="replay"?(al,Hl)=>Zn(al)-Zn(Hl):void 0);return nl.length<2?null:_.jsxs("fieldset",{className:"mb-3 border border-zinc-800 rounded p-2",children:[_.jsx("legend",{className:"text-[0.65rem] font-semibold uppercase tracking-wider text-zinc-400 px-1",children:Mr[$]}),nl.map(al=>_.jsxs("label",{className:"flex items-center gap-1.5 text-xs py-0.5 cursor-pointer hover:text-zinc-200",children:[_.jsx("input",{type:"checkbox",checked:M[$]?.has(al)??!0,onChange:Hl=>C($,al,Hl.target.checked),className:"w-3.5 h-3.5 
rounded border-zinc-600 bg-zinc-800 accent-emerald-500"}),_.jsx("span",{children:al})]},al))]},$)}),P&&_.jsxs("label",{className:"flex items-center gap-1.5 text-xs py-0.5 mt-2 cursor-pointer text-zinc-500 hover:text-zinc-300",children:[_.jsx("input",{type:"checkbox",checked:U,onChange:$=>x($.target.checked),className:"w-3.5 h-3.5 rounded border-zinc-600 bg-zinc-800 accent-emerald-500"}),_.jsx("span",{children:"Show retired models"})]}),_.jsx("p",{className:"text-[0.6rem] text-zinc-600 mt-4",children:zl})]})}const jo=[{key:"score",label:"Scr%"},{key:"accuracy",label:"Acc%"},{key:"completeness",label:"Cmp%"},{key:"efficiency",label:"Eff%"},{key:"wasted",label:"Wst"},{key:"speed",label:"Spd"},{key:"n",label:"N"}];function rf({col:h,sort:M}){return M.col!==h?null:_.jsx("span",{className:"ml-0.5 text-emerald-400",children:M.asc?"▲":"▼"})}function Dr({rows:h,scenarios:M,scenarioAbbrev:C,sort:o,onSort:q,checked:Y,onCompareToggle:H,groups:R,maxGen:p,genInfo:E}){const G=new Map;if(R.length>0){let x=0;for(const P of R)G.set(x,P.label),x+=P.rows.length}const U=2+jo.length+M.length;return _.jsx("div",{className:"w-full overflow-x-auto",children:_.jsxs("table",{className:"text-xs whitespace-nowrap border-collapse",children:[_.jsx("thead",{children:_.jsxs("tr",{className:"border-b border-zinc-800",children:[_.jsx("th",{className:"p-1.5 w-8"}),_.jsxs("th",{className:"p-1.5 text-left cursor-pointer select-none hover:text-emerald-400 sticky left-0 bg-zinc-950 z-10",onClick:()=>q("label"),children:["Model/Backend",_.jsx(rf,{col:"label",sort:o})]}),jo.map(x=>_.jsxs("th",{className:"p-1.5 text-right cursor-pointer select-none hover:text-emerald-400",onClick:()=>q(x.key),children:[x.label,_.jsx(rf,{col:x.key,sort:o})]},x.key)),M.map(x=>_.jsxs("th",{className:"p-1.5 text-right cursor-pointer select-none hover:text-emerald-400",onClick:()=>q(x),title:x,children:[C[x]||x.slice(0,3),_.jsx(rf,{col:x,sort:o})]},x))]})}),_.jsx("tbody",{children:h.map((x,P)=>{const 
k=Y.includes(P),vl=G.get(P);return _.jsxs(ml.Fragment,{children:[vl!=null&&_.jsx("tr",{className:"bg-zinc-900/30",children:_.jsx("td",{colSpan:U,className:"px-2 py-1 text-[0.6rem] font-semibold text-zinc-400 uppercase tracking-wider border-t border-zinc-700/50",children:vl})}),_.jsxs("tr",{className:`border-b border-zinc-900 hover:bg-zinc-900/50 transition-colors ${k?"bg-zinc-800/40":""} ${x.retired?"opacity-60":""}`,children:[_.jsx("td",{className:"p-1.5 text-center",children:_.jsx("input",{type:"checkbox",checked:k,onChange:ul=>H(P,ul.target.checked),className:"w-3.5 h-3.5 rounded border-zinc-600 bg-zinc-800 accent-emerald-500 cursor-pointer"})}),_.jsxs("td",{className:"p-1.5 font-mono sticky left-0 bg-zinc-950 z-10",children:[x.label,p>0&&x.gen<p&&_.jsx("sup",{className:"ml-0.5 text-zinc-500",title:(()=>{const ul=E?.[String(x.gen)];return ul?`gen ${x.gen}: ${ul.note} (commit ${ul.commit}, ${ul.date})`:`gen ${x.gen}`})(),children:zr(x.gen)}),x.retired&&_.jsx("span",{className:"ml-1.5 align-middle text-[0.55rem] uppercase tracking-wider text-zinc-500 border border-zinc-700 rounded px-1",children:"retired"})]}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.score)}`,children:Xn(x.score,1)}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.accuracy)}`,children:Xn(x.accuracy,1)}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.completeness)}`,children:Xn(x.completeness,1)}),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${Eu(x.efficiency)}`,children:Xn(x.efficiency)}),_.jsx("td",{className:"p-1.5 text-right tabular-nums text-zinc-400",children:x.wasted.toFixed(1)}),_.jsxs("td",{className:"p-1.5 text-right tabular-nums text-zinc-400",children:[x.speed.toFixed(1),"s"]}),_.jsx("td",{className:"p-1.5 text-right tabular-nums text-zinc-500",children:x.n}),M.map(ul=>{const zl=x.scenarios[ul],$=x.scenarioRuns?.[ul]??0;let nl,al;return 
zl!=null?(nl=String(zl),al=Eu(zl)):$===0?(nl="I",al="text-zinc-700"):(nl="—",al="text-zinc-600"),_.jsx("td",{className:`p-1.5 text-right tabular-nums ${al}`,children:nl},ul)})]})]},x.label)})})]})})}const Nr=[{key:"score",label:"Score",fmt:h=>h==null?"—":`${h.toFixed(1)}%`,higherBetter:!0},{key:"accuracy",label:"Accuracy",fmt:h=>h==null?"—":`${h.toFixed(1)}%`,higherBetter:!0},{key:"completeness",label:"Completeness",fmt:h=>h==null?"—":`${h.toFixed(1)}%`,higherBetter:!0},{key:"efficiency",label:"Efficiency",fmt:h=>h==null?"—":`${h.toFixed(1)}%`,higherBetter:!0},{key:"wasted",label:"Avg Wasted",fmt:h=>h==null?"—":h.toFixed(1),higherBetter:!1},{key:"speed",label:"Speed",fmt:h=>h==null?"—":`${h.toFixed(1)}s`,higherBetter:!1}];function Ho({va:h,vb:M,higherBetter:C}){if(h==null||M==null)return _.jsx("td",{className:"p-1.5 text-right text-zinc-600",children:"—"});const o=M-h,Y=(o>0?"+":"")+(Number.isInteger(o)?o:o.toFixed(1));let H="text-zinc-500";return o!==0&&(H=o>0===C?"text-emerald-400":"text-red-400"),_.jsx("td",{className:`p-1.5 text-right tabular-nums font-medium ${H}`,children:Y})}function Ur({a:h,b:M,scenarios:C,scenarioAbbrev:o,onSwap:q,onClear:Y}){const H=(R,p)=>p in R.scenarios?R.scenarios[p]:R[p]??null;return _.jsxs("div",{className:"mt-6 border border-zinc-800 rounded-lg p-4 max-w-2xl",children:[_.jsxs("div",{className:"flex items-center justify-between mb-3",children:[_.jsx("h3",{className:"text-sm font-semibold",children:"Compare"}),_.jsxs("div",{className:"flex gap-2",children:[_.jsx("button",{onClick:q,className:"text-xs px-2.5 py-1 rounded border border-zinc-700 hover:border-zinc-500 transition-colors",children:"Swap A↔B"}),_.jsx("button",{onClick:Y,className:"text-xs px-2.5 py-1 rounded border border-zinc-700 hover:border-red-500/50 hover:text-red-400 transition-colors",children:"Clear"})]})]}),_.jsxs("table",{className:"text-xs w-full border-collapse",children:[_.jsx("thead",{children:_.jsxs("tr",{className:"border-b 
border-zinc-800",children:[_.jsx("th",{className:"p-1.5 text-left text-zinc-500",children:"Metric"}),_.jsx("th",{className:"p-1.5 text-right text-zinc-400 max-w-48 truncate",title:h.label,children:h.label}),_.jsx("th",{className:"p-1.5 text-right text-zinc-500 w-16",children:"Delta"}),_.jsx("th",{className:"p-1.5 text-right text-zinc-400 max-w-48 truncate",title:M.label,children:M.label})]})}),_.jsxs("tbody",{children:[Nr.map(R=>{const p=H(h,R.key),E=H(M,R.key);return _.jsxs("tr",{className:"border-b border-zinc-900/50",children:[_.jsx("td",{className:"p-1.5 text-zinc-400",children:R.label}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:R.fmt(p)}),_.jsx(Ho,{va:p,vb:E,higherBetter:R.higherBetter}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:R.fmt(E)})]},R.key)}),_.jsx("tr",{children:_.jsx("td",{colSpan:4,className:"py-1",children:_.jsx("div",{className:"border-t border-zinc-800"})})}),C.map(R=>{const p=h.scenarios[R],E=M.scenarios[R],G=(U,x)=>U!=null?`${U}%`:(x.scenarioRuns?.[R]??0)===0?"I":"—";return _.jsxs("tr",{className:"border-b border-zinc-900/50",children:[_.jsx("td",{className:"p-1.5 text-zinc-500",children:o[R]||R}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:G(p,h)}),_.jsx(Ho,{va:p,vb:E,higherBetter:!0}),_.jsx("td",{className:"p-1.5 text-right tabular-nums",children:G(E,M)})]},R)})]})]})]})}function Cr(h){const M={};for(const C of gf)M[C]=new Set(h.map(o=>o[C]));return M}function Rr(){const[h,M]=ml.useState(null),[C,o]=ml.useState(null),[q,Y]=ml.useState({col:"score",asc:!1}),[H,R]=ml.useState([]),[p,E]=ml.useState("reforged"),[G,U]=ml.useState("all"),[x,P]=ml.useState("all"),[k,vl]=ml.useState("all"),[ul,zl]=ml.useState(!1);ml.useEffect(()=>{Sr().then(g=>{M(g),o(Cr(g.rows))})},[]);const 
$=ml.useMemo(()=>h?ul?h.rows:h.rows.filter(g=>!g.retired):[],[h,ul]),nl=ml.useMemo(()=>h?h.rows.some(g=>g.retired):!1,[h]),al=ml.useMemo(()=>!h||!C?[]:Co($,p).filter(O=>gf.every(N=>!C[N]||C[N].has(O[N]))),[h,C,p,$]),{rows:Hl,scenarios:Tl}=ml.useMemo(()=>Er(al,h?.scenarios??[],x,k,h?.scenarioSuite??{}),[al,h,x,k]),W=ml.useMemo(()=>{if(!h)return{};const g=h.scenarioAbbrev,O=new Set(Tl),N={};for(const[X,K]of Object.entries(g))O.has(X)&&(N[X]=K);return N},[h,Tl]),Bl=ml.useMemo(()=>hf.find(g=>g.id===G)??hf[0],[G]),{sorted:Kl,groups:$l}=ml.useMemo(()=>pr(Hl,Bl,q,Tl,p),[Hl,Bl,q,Tl,p]),kl=ml.useMemo(()=>al.reduce((g,O)=>g+O.n*Tl.length,0),[al,Tl]),fl=ml.useCallback((g,O,N)=>{o(X=>{if(!X)return X;const K={...X,[g]:new Set(X[g])};return N?K[g].add(O):K[g].delete(O),K}),R([])},[]),Rt=ml.useCallback(g=>{E(g),R([])},[]),_t=ml.useCallback(g=>{U(g),R([])},[]),ut=ml.useCallback(g=>{P(g),R([])},[]),z=ml.useCallback(g=>{vl(g),R([])},[]),D=ml.useCallback(g=>{zl(g),R([])},[]),V=ml.useCallback(g=>{Y(O=>O.col===g?{col:g,asc:!O.asc}:{col:g,asc:g==="label"})},[]),dl=ml.useCallback((g,O)=>{R(N=>O?N.length>=2?[N[1],g]:[...N,g]:N.filter(X=>X!==g))},[]),yl=ml.useCallback(()=>{R(g=>[...g].reverse())},[]),d=ml.useCallback(()=>{R([])},[]);return!h||!C?_.jsx("div",{className:"flex items-center justify-center min-h-screen text-zinc-500",children:"Loading..."}):_.jsxs("div",{className:"flex min-h-screen",children:[_.jsx(xr,{rows:$,filters:C,onFilterChange:fl,activeScreen:p,onScreenChange:Rt,activeView:G,onViewChange:_t,scenarioScope:x,onScopeChange:ut,suiteScope:k,onSuiteChange:z,showRetired:ul,onShowRetiredChange:D,hasRetired:nl,filteredCount:al.length,totalCount:Co($,p).length,totalRuns:kl,timestamp:h.timestamp}),_.jsxs("main",{className:"flex-1 min-w-0 p-4 flex 
flex-col",children:[_.jsx(Dr,{rows:Kl,scenarios:Tl,scenarioAbbrev:W,sort:q,onSort:V,checked:H,onCompareToggle:dl,groups:$l,maxGen:h.maxGen??0,genInfo:h.genInfo}),H.length===2&&_.jsx(Ur,{a:Kl[H[0]],b:Kl[H[1]],scenarios:Tl,scenarioAbbrev:W,onSwap:yl,onClear:d}),_.jsxs("p",{className:"text-[0.6rem] text-zinc-600 mt-6",children:["Generated ",h.timestamp]})]})]})}rr.createRoot(document.getElementById("root")).render(_.jsx(ml.StrictMode,{children:_.jsx(Rr,{})}));</script>
     <style rel="stylesheet" crossorigin>@layer properties{@supports (((-webkit-hyphens:none)) and (not (margin-trim:inline))) or ((-moz-orient:inline) and (not (color:rgb(from red r g b)))){*,:before,:after,::backdrop{--tw-border-style:solid;--tw-font-weight:initial;--tw-tracking:initial;--tw-ordinal:initial;--tw-slashed-zero:initial;--tw-numeric-figure:initial;--tw-numeric-spacing:initial;--tw-numeric-fraction:initial;--tw-blur:initial;--tw-brightness:initial;--tw-contrast:initial;--tw-grayscale:initial;--tw-hue-rotate:initial;--tw-invert:initial;--tw-opacity:initial;--tw-saturate:initial;--tw-sepia:initial;--tw-drop-shadow:initial;--tw-drop-shadow-color:initial;--tw-drop-shadow-alpha:100%;--tw-drop-shadow-size:initial}}}@layer theme{:root,:host{--font-sans:ui-sans-serif, system-ui, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji";--font-mono:ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;--color-red-400:oklch(70.4% .191 22.216);--color-red-500:oklch(63.7% .237 25.331);--color-orange-400:oklch(75% .183 55.934);--color-amber-400:oklch(82.8% .189 84.429);--color-emerald-300:oklch(84.5% .143 164.978);--color-emerald-400:oklch(76.5% .177 163.223);--color-emerald-500:oklch(69.6% .17 162.48);--color-zinc-100:oklch(96.7% .001 286.375);--color-zinc-200:oklch(92% .004 286.32);--color-zinc-300:oklch(87.1% .006 286.286);--color-zinc-400:oklch(70.5% .015 286.067);--color-zinc-500:oklch(55.2% .016 285.938);--color-zinc-600:oklch(44.2% .017 285.786);--color-zinc-700:oklch(37% .013 285.805);--color-zinc-800:oklch(27.4% .006 286.033);--color-zinc-900:oklch(21% .006 285.885);--color-zinc-950:oklch(14.1% .005 285.823);--spacing:.25rem;--container-2xl:42rem;--text-xs:.75rem;--text-xs--line-height:calc(1 / .75);--text-sm:.875rem;--text-sm--line-height:calc(1.25 / .875);--text-lg:1.125rem;--text-lg--line-height:calc(1.75 / 
1.125);--font-weight-medium:500;--font-weight-semibold:600;--tracking-wider:.05em;--radius-lg:.5rem;--default-transition-duration:.15s;--default-transition-timing-function:cubic-bezier(.4, 0, .2, 1);--default-font-family:var(--font-sans);--default-mono-font-family:var(--font-mono)}}@layer base{*,:after,:before,::backdrop{box-sizing:border-box;border:0 solid;margin:0;padding:0}::file-selector-button{box-sizing:border-box;border:0 solid;margin:0;padding:0}html,:host{-webkit-text-size-adjust:100%;tab-size:4;line-height:1.5;font-family:var(--default-font-family,ui-sans-serif, system-ui, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji");font-feature-settings:var(--default-font-feature-settings,normal);font-variation-settings:var(--default-font-variation-settings,normal);-webkit-tap-highlight-color:transparent}hr{height:0;color:inherit;border-top-width:1px}abbr:where([title]){-webkit-text-decoration:underline dotted;text-decoration:underline dotted}h1,h2,h3,h4,h5,h6{font-size:inherit;font-weight:inherit}a{color:inherit;-webkit-text-decoration:inherit;text-decoration:inherit}b,strong{font-weight:bolder}code,kbd,samp,pre{font-family:var(--default-mono-font-family,ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", 
monospace);font-feature-settings:var(--default-mono-font-feature-settings,normal);font-variation-settings:var(--default-mono-font-variation-settings,normal);font-size:1em}small{font-size:80%}sub,sup{vertical-align:baseline;font-size:75%;line-height:0;position:relative}sub{bottom:-.25em}sup{top:-.5em}table{text-indent:0;border-color:inherit;border-collapse:collapse}:-moz-focusring{outline:auto}progress{vertical-align:baseline}summary{display:list-item}ol,ul,menu{list-style:none}img,svg,video,canvas,audio,iframe,embed,object{vertical-align:middle;display:block}img,video{max-width:100%;height:auto}button,input,select,optgroup,textarea{font:inherit;font-feature-settings:inherit;font-variation-settings:inherit;letter-spacing:inherit;color:inherit;opacity:1;background-color:#0000;border-radius:0}::file-selector-button{font:inherit;font-feature-settings:inherit;font-variation-settings:inherit;letter-spacing:inherit;color:inherit;opacity:1;background-color:#0000;border-radius:0}:where(select:is([multiple],[size])) optgroup{font-weight:bolder}:where(select:is([multiple],[size])) optgroup option{padding-inline-start:20px}::file-selector-button{margin-inline-end:4px}::placeholder{opacity:1}@supports (not ((-webkit-appearance:-apple-pay-button))) or (contain-intrinsic-size:1px){::placeholder{color:currentColor}@supports (color:color-mix(in lab,red,red)){::placeholder{color:color-mix(in oklab,currentcolor 
50%,transparent)}}}textarea{resize:vertical}::-webkit-search-decoration{-webkit-appearance:none}::-webkit-date-and-time-value{min-height:1lh;text-align:inherit}::-webkit-datetime-edit{display:inline-flex}::-webkit-datetime-edit-fields-wrapper{padding:0}::-webkit-datetime-edit{padding-block:0}::-webkit-datetime-edit-year-field{padding-block:0}::-webkit-datetime-edit-month-field{padding-block:0}::-webkit-datetime-edit-day-field{padding-block:0}::-webkit-datetime-edit-hour-field{padding-block:0}::-webkit-datetime-edit-minute-field{padding-block:0}::-webkit-datetime-edit-second-field{padding-block:0}::-webkit-datetime-edit-millisecond-field{padding-block:0}::-webkit-datetime-edit-meridiem-field{padding-block:0}::-webkit-calendar-picker-indicator{line-height:1}:-moz-ui-invalid{box-shadow:none}button,input:where([type=button],[type=reset],[type=submit]){appearance:button}::file-selector-button{appearance:button}::-webkit-inner-spin-button{height:auto}::-webkit-outer-spin-button{height:auto}[hidden]:where(:not([hidden=until-found])){display:none!important}}@layer components;@layer utilities{.sticky{position:sticky}.top-0{top:calc(var(--spacing) * 0)}.left-0{left:calc(var(--spacing) * 0)}.z-10{z-index:10}.mt-2{margin-top:calc(var(--spacing) * 2)}.mt-4{margin-top:calc(var(--spacing) * 4)}.mt-6{margin-top:calc(var(--spacing) * 6)}.mb-0\.5{margin-bottom:calc(var(--spacing) * .5)}.mb-1{margin-bottom:calc(var(--spacing) * 1)}.mb-3{margin-bottom:calc(var(--spacing) * 3)}.mb-4{margin-bottom:calc(var(--spacing) * 4)}.ml-0\.5{margin-left:calc(var(--spacing) * .5)}.ml-1\.5{margin-left:calc(var(--spacing) * 1.5)}.flex{display:flex}.hidden{display:none}.table{display:table}.h-3\.5{height:calc(var(--spacing) * 3.5)}.h-screen{height:100vh}.min-h-screen{min-height:100vh}.w-3\.5{width:calc(var(--spacing) * 3.5)}.w-8{width:calc(var(--spacing) * 8)}.w-16{width:calc(var(--spacing) * 16)}.w-52{width:calc(var(--spacing) * 
52)}.w-full{width:100%}.max-w-2xl{max-width:var(--container-2xl)}.max-w-48{max-width:calc(var(--spacing) * 48)}.min-w-0{min-width:calc(var(--spacing) * 0)}.min-w-52{min-width:calc(var(--spacing) * 52)}.flex-1{flex:1}.shrink-0{flex-shrink:0}.border-collapse{border-collapse:collapse}.cursor-pointer{cursor:pointer}.flex-col{flex-direction:column}.flex-wrap{flex-wrap:wrap}.items-center{align-items:center}.justify-between{justify-content:space-between}.justify-center{justify-content:center}.gap-1{gap:calc(var(--spacing) * 1)}.gap-1\.5{gap:calc(var(--spacing) * 1.5)}.gap-2{gap:calc(var(--spacing) * 2)}.truncate{text-overflow:ellipsis;white-space:nowrap;overflow:hidden}.overflow-hidden{overflow:hidden}.overflow-x-auto{overflow-x:auto}.overflow-y-auto{overflow-y:auto}.rounded{border-radius:.25rem}.rounded-full{border-radius:3.40282e38px}.rounded-lg{border-radius:var(--radius-lg)}.border{border-style:var(--tw-border-style);border-width:1px}.border-t{border-top-style:var(--tw-border-style);border-top-width:1px}.border-r{border-right-style:var(--tw-border-style);border-right-width:1px}.border-b{border-bottom-style:var(--tw-border-style);border-bottom-width:1px}.border-emerald-500{border-color:var(--color-emerald-500)}.border-zinc-600{border-color:var(--color-zinc-600)}.border-zinc-700{border-color:var(--color-zinc-700)}.border-zinc-700\/50{border-color:#3f3f4680}@supports (color:color-mix(in lab,red,red)){.border-zinc-700\/50{border-color:color-mix(in oklab,var(--color-zinc-700) 50%,transparent)}}.border-zinc-800{border-color:var(--color-zinc-800)}.border-zinc-900{border-color:var(--color-zinc-900)}.border-zinc-900\/50{border-color:#18181b80}@supports (color:color-mix(in lab,red,red)){.border-zinc-900\/50{border-color:color-mix(in oklab,var(--color-zinc-900) 50%,transparent)}}.bg-emerald-500\/15{background-color:#00bb7f26}@supports (color:color-mix(in lab,red,red)){.bg-emerald-500\/15{background-color:color-mix(in oklab,var(--color-emerald-500) 
15%,transparent)}}.bg-emerald-500\/20{background-color:#00bb7f33}@supports (color:color-mix(in lab,red,red)){.bg-emerald-500\/20{background-color:color-mix(in oklab,var(--color-emerald-500) 20%,transparent)}}.bg-zinc-800{background-color:var(--color-zinc-800)}.bg-zinc-800\/40{background-color:#27272a66}@supports (color:color-mix(in lab,red,red)){.bg-zinc-800\/40{background-color:color-mix(in oklab,var(--color-zinc-800) 40%,transparent)}}.bg-zinc-900\/30{background-color:#18181b4d}@supports (color:color-mix(in lab,red,red)){.bg-zinc-900\/30{background-color:color-mix(in oklab,var(--color-zinc-900) 30%,transparent)}}.bg-zinc-900\/40{background-color:#18181b66}@supports (color:color-mix(in lab,red,red)){.bg-zinc-900\/40{background-color:color-mix(in oklab,var(--color-zinc-900) 40%,transparent)}}.bg-zinc-950{background-color:var(--color-zinc-950)}.bg-zinc-950\/80{background-color:#09090bcc}@supports (color:color-mix(in lab,red,red)){.bg-zinc-950\/80{background-color:color-mix(in oklab,var(--color-zinc-950) 80%,transparent)}}.p-1\.5{padding:calc(var(--spacing) * 1.5)}.p-2{padding:calc(var(--spacing) * 2)}.p-4{padding:calc(var(--spacing) * 4)}.px-1{padding-inline:calc(var(--spacing) * 1)}.px-2{padding-inline:calc(var(--spacing) * 2)}.px-2\.5{padding-inline:calc(var(--spacing) * 2.5)}.py-0\.5{padding-block:calc(var(--spacing) * .5)}.py-1{padding-block:calc(var(--spacing) * 1)}.py-1\.5{padding-block:calc(var(--spacing) * 
1.5)}.text-center{text-align:center}.text-left{text-align:left}.text-right{text-align:right}.align-middle{vertical-align:middle}.font-mono{font-family:var(--font-mono)}.text-lg{font-size:var(--text-lg);line-height:var(--tw-leading,var(--text-lg--line-height))}.text-sm{font-size:var(--text-sm);line-height:var(--tw-leading,var(--text-sm--line-height))}.text-xs{font-size:var(--text-xs);line-height:var(--tw-leading,var(--text-xs--line-height))}.text-\[0\.6rem\]{font-size:.6rem}.text-\[0\.55rem\]{font-size:.55rem}.text-\[0\.65rem\]{font-size:.65rem}.font-medium{--tw-font-weight:var(--font-weight-medium);font-weight:var(--font-weight-medium)}.font-semibold{--tw-font-weight:var(--font-weight-semibold);font-weight:var(--font-weight-semibold)}.tracking-wider{--tw-tracking:var(--tracking-wider);letter-spacing:var(--tracking-wider)}.whitespace-nowrap{white-space:nowrap}.text-amber-400{color:var(--color-amber-400)}.text-emerald-300{color:var(--color-emerald-300)}.text-emerald-400{color:var(--color-emerald-400)}.text-emerald-500\/80{color:#00bb7fcc}@supports (color:color-mix(in lab,red,red)){.text-emerald-500\/80{color:color-mix(in oklab,var(--color-emerald-500) 80%,transparent)}}.text-orange-400{color:var(--color-orange-400)}.text-red-400{color:var(--color-red-400)}.text-zinc-100{color:var(--color-zinc-100)}.text-zinc-400{color:var(--color-zinc-400)}.text-zinc-500{color:var(--color-zinc-500)}.text-zinc-600{color:var(--color-zinc-600)}.text-zinc-700{color:var(--color-zinc-700)}.uppercase{text-transform:uppercase}.tabular-nums{--tw-numeric-spacing:tabular-nums;font-variant-numeric:var(--tw-ordinal,) var(--tw-slashed-zero,) var(--tw-numeric-figure,) var(--tw-numeric-spacing,) var(--tw-numeric-fraction,)}.accent-emerald-500{accent-color:var(--color-emerald-500)}.opacity-60{opacity:.6}.filter{filter:var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) 
var(--tw-drop-shadow,)}.transition-colors{transition-property:color,background-color,border-color,outline-color,text-decoration-color,fill,stroke,--tw-gradient-from,--tw-gradient-via,--tw-gradient-to;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.select-none{-webkit-user-select:none;user-select:none}@media(hover:hover){.hover\:border-red-500\/50:hover{border-color:#fb2c3680}@supports (color:color-mix(in lab,red,red)){.hover\:border-red-500\/50:hover{border-color:color-mix(in oklab,var(--color-red-500) 50%,transparent)}}.hover\:border-zinc-500:hover{border-color:var(--color-zinc-500)}.hover\:bg-zinc-900\/50:hover{background-color:#18181b80}@supports (color:color-mix(in lab,red,red)){.hover\:bg-zinc-900\/50:hover{background-color:color-mix(in oklab,var(--color-zinc-900) 50%,transparent)}}.hover\:bg-zinc-900\/70:hover{background-color:#18181bb3}@supports (color:color-mix(in lab,red,red)){.hover\:bg-zinc-900\/70:hover{background-color:color-mix(in oklab,var(--color-zinc-900) 70%,transparent)}}.hover\:text-emerald-400:hover{color:var(--color-emerald-400)}.hover\:text-red-400:hover{color:var(--color-red-400)}.hover\:text-zinc-200:hover{color:var(--color-zinc-200)}.hover\:text-zinc-300:hover{color:var(--color-zinc-300)}}}@property --tw-border-style{syntax:"*";inherits:false;initial-value:solid}@property --tw-font-weight{syntax:"*";inherits:false}@property --tw-tracking{syntax:"*";inherits:false}@property --tw-ordinal{syntax:"*";inherits:false}@property --tw-slashed-zero{syntax:"*";inherits:false}@property --tw-numeric-figure{syntax:"*";inherits:false}@property --tw-numeric-spacing{syntax:"*";inherits:false}@property --tw-numeric-fraction{syntax:"*";inherits:false}@property --tw-blur{syntax:"*";inherits:false}@property --tw-brightness{syntax:"*";inherits:false}@property --tw-contrast{syntax:"*";inherits:false}@property 
--tw-grayscale{syntax:"*";inherits:false}@property --tw-hue-rotate{syntax:"*";inherits:false}@property --tw-invert{syntax:"*";inherits:false}@property --tw-opacity{syntax:"*";inherits:false}@property --tw-saturate{syntax:"*";inherits:false}@property --tw-sepia{syntax:"*";inherits:false}@property --tw-drop-shadow{syntax:"*";inherits:false}@property --tw-drop-shadow-color{syntax:"*";inherits:false}@property --tw-drop-shadow-alpha{syntax:"<percentage>";inherits:false;initial-value:100%}@property --tw-drop-shadow-size{syntax:"*";inherits:false}</style>
-  <script>window.__FORGE_DATA__ = {"rows": [{"label": "claude-opus-4-6 AN/N [reforged]", "model": "claude-opus-4-6", "backend": "anthropic", "mode": "native", "ablation": "reforged", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 99.2, "accuracy": 99.8, "completeness": 99.4, "efficiency": 100.0, "wasted": 0.0, "speed": 15.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 98, "grounded_synthesis": 94, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 98, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 98}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 245, "grounded_synthesis": 470, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 245, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 392}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 98, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 150, "argument_transformation": 147, "grounded_synthesis": 141, "inconsistent_api_recovery": 349, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, 
"error_recovery_stateful": 144, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 151, "argument_transformation_stateful": 147, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 358}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 139.05, "argument_fidelity": 363.67, 
"tool_selection": 326.01, "basic_2step": 331.58, "sequential_3step": 446.21, "conditional_routing": 945.05, "sequential_reasoning": 464.91, "error_recovery": 308.01, "data_gap_recovery": 1029.22, "data_gap_recovery_extended": 1009.33, "argument_transformation": 1013.19, "grounded_synthesis": 1892.46, "inconsistent_api_recovery": 1518.48, "relevance_detection_stateful": 125.53, "argument_fidelity_stateful": 1096.25, "tool_selection_stateful": 454.47, "basic_2step_stateful": 226.33, "sequential_3step_stateful": 476.66, "conditional_routing_stateful": 701.36, "sequential_reasoning_stateful": 851.35, "error_recovery_stateful": 874.25, "data_gap_recovery_stateful": 639.94, "data_gap_recovery_extended_stateful": 891.99, "argument_transformation_stateful": 1022.41, "grounded_synthesis_stateful": 1857.26, "inconsistent_api_recovery_stateful": 1176.68}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "claude-sonnet-4-6 AN/N [reforged]", "model": "claude-sonnet-4-6", "backend": "anthropic", "mode": "native", "ablation": "reforged", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 98.4, "accuracy": 98.5, "completeness": 99.9, "efficiency": 100.0, "wasted": 0.1, "speed": 13.1, "n": 
50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 98, "argument_transformation": 74, "grounded_synthesis": 98, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 88, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 37, "grounded_synthesis": 49, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 392, "argument_transformation": 185, "grounded_synthesis": 490, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 220, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 165, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 169, "argument_transformation": 111, "grounded_synthesis": 149, "inconsistent_api_recovery": 308, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 175, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 201, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 163, "argument_transformation_stateful": 132, "grounded_synthesis_stateful": 152, "inconsistent_api_recovery_stateful": 300}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 15.0, "basic_2step": 0.0, 
"sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 25.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 1.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 136.13, "argument_fidelity": 403.65, "tool_selection": 469.63, "basic_2step": 167.39, "sequential_3step": 396.81, "conditional_routing": 610.88, "sequential_reasoning": 626.4, "error_recovery": 359.33, "data_gap_recovery": 700.2, "data_gap_recovery_extended": 929.08, "argument_transformation": 950.38, "grounded_synthesis": 1533.37, "inconsistent_api_recovery": 1199.55, 
"relevance_detection_stateful": 132.7, "argument_fidelity_stateful": 422.21, "tool_selection_stateful": 457.96, "basic_2step_stateful": 183.83, "sequential_3step_stateful": 270.31, "conditional_routing_stateful": 607.73, "sequential_reasoning_stateful": 632.04, "error_recovery_stateful": 368.27, "data_gap_recovery_stateful": 716.62, "data_gap_recovery_extended_stateful": 899.79, "argument_transformation_stateful": 1031.82, "grounded_synthesis_stateful": 1647.03, "inconsistent_api_recovery_stateful": 1205.45}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-haiku-4-5-20251001 AN/N [reforged]", "model": "claude-haiku-4-5-20251001", "backend": "anthropic", "mode": "native", "ablation": "reforged", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 94.5, "accuracy": 94.9, "completeness": 99.6, "efficiency": 100.0, "wasted": 0.3, "speed": 8.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 80, "argument_transformation": 80, 
"grounded_synthesis": 98, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 76, "argument_transformation_stateful": 36, "grounded_synthesis_stateful": 94, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 40, "argument_transformation": 40, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 18, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, 
"conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 320, "argument_transformation": 200, "grounded_synthesis": 490, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 304, "argument_transformation_stateful": 90, "grounded_synthesis_stateful": 470, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 104, "argument_fidelity": 156, "tool_selection": 150, "basic_2step": 152, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 264, "error_recovery": 150, "data_gap_recovery": 200, "data_gap_recovery_extended": 199, "argument_transformation": 134, "grounded_synthesis": 169, "inconsistent_api_recovery": 363, "relevance_detection_stateful": 103, "argument_fidelity_stateful": 153, "tool_selection_stateful": 151, "basic_2step_stateful": 153, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 272, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 199, "data_gap_recovery_extended_stateful": 192, "argument_transformation_stateful": 60, "grounded_synthesis_stateful": 154, "inconsistent_api_recovery_stateful": 353}, "scenarioWastedSum": {"relevance_detection": 54.0, "argument_fidelity": 6.0, "tool_selection": 0.0, "basic_2step": 52.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 64.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 1.0, "relevance_detection_stateful": 53.0, 
"argument_fidelity_stateful": 3.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 53.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 84.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 3.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 312.02, "argument_fidelity": 180.11, "tool_selection": 524.95, "basic_2step": 170.11, "sequential_3step": 182.72, "conditional_routing": 308.02, "sequential_reasoning": 429.51, "error_recovery": 160.62, "data_gap_recovery": 515.95, "data_gap_recovery_extended": 527.98, "argument_transformation": 431.47, "grounded_synthesis": 804.65, "inconsistent_api_recovery": 730.53, "relevance_detection_stateful": 332.69, "argument_fidelity_stateful": 234.48, "tool_selection_stateful": 259.69, "basic_2step_stateful": 236.42, "sequential_3step_stateful": 178.39, "conditional_routing_stateful": 296.18, "sequential_reasoning_stateful": 405.51, "error_recovery_stateful": 256.43, 
"data_gap_recovery_stateful": 504.55, "data_gap_recovery_extended_stateful": 676.16, "argument_transformation_stateful": 680.29, "grounded_synthesis_stateful": 822.73, "inconsistent_api_recovery_stateful": 804.88}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 94.8, "accuracy": 95.1, "completeness": 99.7, "efficiency": 100.0, "wasted": 0.6, "speed": 12.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 96, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 72, "argument_transformation": 78, "grounded_synthesis": 92, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 92, 
"error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 68, "argument_transformation_stateful": 76, "grounded_synthesis_stateful": 94, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 36, "argument_transformation": 39, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 192, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 288, "argument_transformation": 195, "grounded_synthesis": 460, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 272, "argument_transformation_stateful": 190, "grounded_synthesis_stateful": 470, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 94, "argument_fidelity": 164, "tool_selection": 214, "basic_2step": 107, "sequential_3step": 186, "conditional_routing": 139, "sequential_reasoning": 249, "error_recovery": 167, "data_gap_recovery": 172, "data_gap_recovery_extended": 152, "argument_transformation": 186, "grounded_synthesis": 390, "inconsistent_api_recovery": 372, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 168, "tool_selection_stateful": 222, "basic_2step_stateful": 106, "sequential_3step_stateful": 192, "conditional_routing_stateful": 146, "sequential_reasoning_stateful": 222, "error_recovery_stateful": 158, "data_gap_recovery_stateful": 178, "data_gap_recovery_extended_stateful": 152, "argument_transformation_stateful": 188, "grounded_synthesis_stateful": 398, "inconsistent_api_recovery_stateful": 385}, "scenarioWastedSum": {"relevance_detection": 44.0, "argument_fidelity": 14.0, "tool_selection": 64.0, "basic_2step": 7.0, "sequential_3step": 36.0, "conditional_routing": 7.0, "sequential_reasoning": 59.0, "error_recovery": 67.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 19.0, "grounded_synthesis": 110.0, "inconsistent_api_recovery": 12.0, "relevance_detection_stateful": 39.0, "argument_fidelity_stateful": 18.0, "tool_selection_stateful": 72.0, "basic_2step_stateful": 6.0, "sequential_3step_stateful": 42.0, "conditional_routing_stateful": 12.0, "sequential_reasoning_stateful": 41.0, "error_recovery_stateful": 8.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 
0.0, "argument_transformation_stateful": 27.0, "grounded_synthesis_stateful": 96.0, "inconsistent_api_recovery_stateful": 20.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 508.07, "argument_fidelity": 198.63, "tool_selection": 433.45, "basic_2step": 141.72, "sequential_3step": 273.79, "conditional_routing": 470.21, "sequential_reasoning": 420.81, "error_recovery": 620.19, "data_gap_recovery": 518.6, "data_gap_recovery_extended": 540.97, "argument_transformation": 1808.06, "grounded_synthesis": 1506.48, "inconsistent_api_recovery": 990.66, "relevance_detection_stateful": 499.19, "argument_fidelity_stateful": 202.09, "tool_selection_stateful": 381.19, "basic_2step_stateful": 138.95, "sequential_3step_stateful": 276.05, "conditional_routing_stateful": 540.46, "sequential_reasoning_stateful": 370.69, "error_recovery_stateful": 355.01, "data_gap_recovery_stateful": 505.92, "data_gap_recovery_extended_stateful": 570.54, "argument_transformation_stateful": 1824.15, "grounded_synthesis_stateful": 1422.5, "inconsistent_api_recovery_stateful": 952.6}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/N [reforged]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 93.2, "accuracy": 93.3, "completeness": 99.8, "efficiency": 82.3, "wasted": 1.4, "speed": 37.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 98, "data_gap_recovery": 100, "data_gap_recovery_extended": 74, "argument_transformation": 38, "grounded_synthesis": 88, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 78, "argument_transformation_stateful": 56, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 98}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 37, "argument_transformation": 19, "grounded_synthesis": 44, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 
50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 98, "data_gap_recovery": 250, "data_gap_recovery_extended": 296, "argument_transformation": 95, "grounded_synthesis": 440, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 312, "argument_transformation_stateful": 140, "grounded_synthesis_stateful": 480, 
"inconsistent_api_recovery_stateful": 392}, "scenarioActualCalls": {"relevance_detection": 53, "argument_fidelity": 254, "tool_selection": 191, "basic_2step": 135, "sequential_3step": 219, "conditional_routing": 255, "sequential_reasoning": 309, "error_recovery": 201, "data_gap_recovery": 310, "data_gap_recovery_extended": 248, "argument_transformation": 112, "grounded_synthesis": 238, "inconsistent_api_recovery": 608, "relevance_detection_stateful": 54, "argument_fidelity_stateful": 266, "tool_selection_stateful": 204, "basic_2step_stateful": 133, "sequential_3step_stateful": 234, "conditional_routing_stateful": 242, "sequential_reasoning_stateful": 315, "error_recovery_stateful": 220, "data_gap_recovery_stateful": 316, "data_gap_recovery_extended_stateful": 270, "argument_transformation_stateful": 172, "grounded_synthesis_stateful": 285, "inconsistent_api_recovery_stateful": 583}, "scenarioWastedSum": {"relevance_detection": 3.0, "argument_fidelity": 104.0, "tool_selection": 41.0, "basic_2step": 35.0, "sequential_3step": 69.0, "conditional_routing": 89.0, "sequential_reasoning": 109.0, "error_recovery": 109.0, "data_gap_recovery": 73.0, "data_gap_recovery_extended": 17.0, "argument_transformation": 45.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 216.0, "relevance_detection_stateful": 4.0, "argument_fidelity_stateful": 116.0, "tool_selection_stateful": 54.0, "basic_2step_stateful": 33.0, "sequential_3step_stateful": 84.0, "conditional_routing_stateful": 97.0, "sequential_reasoning_stateful": 115.0, "error_recovery_stateful": 70.0, "data_gap_recovery_stateful": 79.0, "data_gap_recovery_extended_stateful": 11.0, "argument_transformation_stateful": 48.0, "grounded_synthesis_stateful": 6.0, "inconsistent_api_recovery_stateful": 191.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 279.82, "argument_fidelity": 772.38, "tool_selection": 460.93, "basic_2step": 311.2, "sequential_3step": 653.05, "conditional_routing": 1843.5, "sequential_reasoning": 981.56, "error_recovery": 766.11, "data_gap_recovery": 2186.91, "data_gap_recovery_extended": 2653.6, "argument_transformation": 5412.58, "grounded_synthesis": 3722.24, "inconsistent_api_recovery": 4409.74, "relevance_detection_stateful": 256.35, "argument_fidelity_stateful": 789.72, "tool_selection_stateful": 472.2, "basic_2step_stateful": 317.74, "sequential_3step_stateful": 693.31, "conditional_routing_stateful": 1866.41, "sequential_reasoning_stateful": 948.52, "error_recovery_stateful": 704.44, "data_gap_recovery_stateful": 2160.24, "data_gap_recovery_extended_stateful": 2741.61, "argument_transformation_stateful": 5406.31, "grounded_synthesis_stateful": 3714.94, "inconsistent_api_recovery_stateful": 4325.4}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3.6-27B-Q4_K_M LS/N [reforged]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 92.2, "accuracy": 92.5, "completeness": 99.6, "efficiency": 100.0, "wasted": 0.4, "speed": 37.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 98, "data_gap_recovery": 100, "data_gap_recovery_extended": 22, "argument_transformation": 74, "grounded_synthesis": 98, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 78, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 11, "argument_transformation": 37, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 98, "data_gap_recovery": 250, "data_gap_recovery_extended": 88, "argument_transformation": 185, "grounded_synthesis": 490, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 144, "argument_transformation_stateful": 195, "grounded_synthesis_stateful": 480, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 66, "argument_fidelity": 155, "tool_selection": 161, "basic_2step": 108, "sequential_3step": 187, "conditional_routing": 190, "sequential_reasoning": 225, "error_recovery": 150, "data_gap_recovery": 186, "data_gap_recovery_extended": 
48, "argument_transformation": 196, "grounded_synthesis": 194, "inconsistent_api_recovery": 419, "relevance_detection_stateful": 66, "argument_fidelity_stateful": 157, "tool_selection_stateful": 165, "basic_2step_stateful": 116, "sequential_3step_stateful": 183, "conditional_routing_stateful": 208, "sequential_reasoning_stateful": 223, "error_recovery_stateful": 146, "data_gap_recovery_stateful": 178, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 197, "grounded_synthesis_stateful": 173, "inconsistent_api_recovery_stateful": 421}, "scenarioWastedSum": {"relevance_detection": 16.0, "argument_fidelity": 5.0, "tool_selection": 11.0, "basic_2step": 8.0, "sequential_3step": 37.0, "conditional_routing": 28.0, "sequential_reasoning": 25.0, "error_recovery": 52.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 32.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 36.0, "relevance_detection_stateful": 16.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 15.0, "basic_2step_stateful": 16.0, "sequential_3step_stateful": 33.0, "conditional_routing_stateful": 44.0, "sequential_reasoning_stateful": 23.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 4.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 14.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 40.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 1708.49, "argument_fidelity": 577.79, "tool_selection": 455.19, "basic_2step": 362.6, "sequential_3step": 660.55, "conditional_routing": 1717.96, "sequential_reasoning": 1053.64, "error_recovery": 1021.89, "data_gap_recovery": 1994.37, "data_gap_recovery_extended": 1617.47, "argument_transformation": 5984.42, "grounded_synthesis": 3233.57, "inconsistent_api_recovery": 3652.94, "relevance_detection_stateful": 1602.1, "argument_fidelity_stateful": 580.54, "tool_selection_stateful": 435.14, "basic_2step_stateful": 600.39, "sequential_3step_stateful": 718.76, "conditional_routing_stateful": 1738.46, "sequential_reasoning_stateful": 1034.03, "error_recovery_stateful": 968.92, "data_gap_recovery_stateful": 1877.28, "data_gap_recovery_extended_stateful": 1731.45, "argument_transformation_stateful": 6709.6, "grounded_synthesis_stateful": 3147.67, "inconsistent_api_recovery_stateful": 3907.64}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged]", "model": "Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 92.1, "accuracy": 92.4, "completeness": 99.7, "efficiency": 82.1, "wasted": 1.3, "speed": 11.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 96, "argument_transformation": 14, "grounded_synthesis": 84, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 94, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 
50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 7, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 
50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 384, "argument_transformation": 35, "grounded_synthesis": 420, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 376, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 480, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 61, "argument_fidelity": 246, "tool_selection": 244, "basic_2step": 172, "sequential_3step": 215, "conditional_routing": 298, "sequential_reasoning": 317, "error_recovery": 178, "data_gap_recovery": 290, "data_gap_recovery_extended": 320, "argument_transformation": 43, "grounded_synthesis": 311, "inconsistent_api_recovery": 477, "relevance_detection_stateful": 59, "argument_fidelity_stateful": 251, "tool_selection_stateful": 236, "basic_2step_stateful": 193, "sequential_3step_stateful": 243, "conditional_routing_stateful": 308, "sequential_reasoning_stateful": 335, 
"error_recovery_stateful": 175, "data_gap_recovery_stateful": 277, "data_gap_recovery_extended_stateful": 310, "argument_transformation_stateful": 63, "grounded_synthesis_stateful": 334, "inconsistent_api_recovery_stateful": 469}, "scenarioWastedSum": {"relevance_detection": 11.0, "argument_fidelity": 96.0, "tool_selection": 94.0, "basic_2step": 72.0, "sequential_3step": 65.0, "conditional_routing": 122.0, "sequential_reasoning": 121.0, "error_recovery": 78.0, "data_gap_recovery": 54.0, "data_gap_recovery_extended": 16.0, "argument_transformation": 32.0, "grounded_synthesis": 6.0, "inconsistent_api_recovery": 84.0, "relevance_detection_stateful": 9.0, "argument_fidelity_stateful": 101.0, "tool_selection_stateful": 86.0, "basic_2step_stateful": 93.0, "sequential_3step_stateful": 93.0, "conditional_routing_stateful": 140.0, "sequential_reasoning_stateful": 135.0, "error_recovery_stateful": 25.0, "data_gap_recovery_stateful": 50.0, "data_gap_recovery_extended_stateful": 22.0, "argument_transformation_stateful": 39.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 71.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 91.81, 
"argument_fidelity": 239.08, "tool_selection": 171.19, "basic_2step": 125.25, "sequential_3step": 211.5, "conditional_routing": 563.1, "sequential_reasoning": 329.57, "error_recovery": 185.75, "data_gap_recovery": 613.21, "data_gap_recovery_extended": 719.75, "argument_transformation": 1689.59, "grounded_synthesis": 1101.91, "inconsistent_api_recovery": 1167.15, "relevance_detection_stateful": 84.33, "argument_fidelity_stateful": 248.12, "tool_selection_stateful": 167.56, "basic_2step_stateful": 140.99, "sequential_3step_stateful": 241.01, "conditional_routing_stateful": 589.17, "sequential_reasoning_stateful": 349.54, "error_recovery_stateful": 185.09, "data_gap_recovery_stateful": 619.16, "data_gap_recovery_extended_stateful": 698.7, "argument_transformation_stateful": 1629.1, "grounded_synthesis_stateful": 1069.15, "inconsistent_api_recovery_stateful": 1150.96}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-opus-4-6 AN/N [bare]", "model": "claude-opus-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 87.9, "accuracy": 95.8, "completeness": 91.8, "efficiency": 100.0, "wasted": 0.0, "speed": 
16.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 98, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 100, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 96, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 250, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 240, "grounded_synthesis_stateful": 480, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 151, "argument_transformation": 151, "grounded_synthesis": 150, "inconsistent_api_recovery": 246, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 147, "data_gap_recovery_extended_stateful": 150, "argument_transformation_stateful": 144, "grounded_synthesis_stateful": 144, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, 
"conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 123.04, "argument_fidelity": 442.53, "tool_selection": 489.76, "basic_2step": 438.17, "sequential_3step": 532.18, "conditional_routing": 614.81, "sequential_reasoning": 554.22, "error_recovery": 0.0, "data_gap_recovery": 769.41, "data_gap_recovery_extended": 902.93, "argument_transformation": 1036.02, "grounded_synthesis": 1839.76, "inconsistent_api_recovery": 1041.4, "relevance_detection_stateful": 130.52, 
"argument_fidelity_stateful": 405.75, "tool_selection_stateful": 334.24, "basic_2step_stateful": 598.51, "sequential_3step_stateful": 682.77, "conditional_routing_stateful": 691.1, "sequential_reasoning_stateful": 546.89, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 702.07, "data_gap_recovery_extended_stateful": 1094.57, "argument_transformation_stateful": 1544.11, "grounded_synthesis_stateful": 3220.45, "inconsistent_api_recovery_stateful": 911.51}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/P [reforged]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 86.8, "accuracy": 86.8, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.1, "speed": 24.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 42, "argument_transformation": 10, "grounded_synthesis": 78, "inconsistent_api_recovery": 100, 
"relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 21, "argument_transformation": 5, "grounded_synthesis": 39, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 18, 
"argument_transformation_stateful": 5, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, 
"data_gap_recovery": 250, "data_gap_recovery_extended": 168, "argument_transformation": 25, "grounded_synthesis": 390, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 144, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 400, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 194, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 234, "data_gap_recovery_extended": 86, "argument_transformation": 19, "grounded_synthesis": 232, "inconsistent_api_recovery": 315, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 234, "data_gap_recovery_extended_stateful": 73, "argument_transformation_stateful": 18, "grounded_synthesis_stateful": 232, "inconsistent_api_recovery_stateful": 322}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 24.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 1.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 8.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, 
"sequential_3step_stateful": 0.0, "conditional_routing_stateful": 23.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 3.0, "inconsistent_api_recovery_stateful": 12.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 221.73, "argument_fidelity": 485.12, "tool_selection": 342.97, "basic_2step": 274.45, "sequential_3step": 478.63, "conditional_routing": 1301.97, "sequential_reasoning": 667.1, "error_recovery": 616.97, "data_gap_recovery": 1507.24, "data_gap_recovery_extended": 1727.18, "argument_transformation": 2545.03, "grounded_synthesis": 2936.59, "inconsistent_api_recovery": 2370.72, "relevance_detection_stateful": 209.71, "argument_fidelity_stateful": 494.69, "tool_selection_stateful": 346.25, "basic_2step_stateful": 259.08, "sequential_3step_stateful": 528.54, "conditional_routing_stateful": 1334.27, "sequential_reasoning_stateful": 683.83, "error_recovery_stateful": 632.94, "data_gap_recovery_stateful": 1450.28, "data_gap_recovery_extended_stateful": 1656.15, 
"argument_transformation_stateful": 3058.14, "grounded_synthesis_stateful": 3179.33, "inconsistent_api_recovery_stateful": 2373.82}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-opus-4-6 AN/N [bare+any]", "model": "claude-opus-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 87.1, "accuracy": 95.4, "completeness": 91.3, "efficiency": 100.0, "wasted": 0.0, "speed": 12.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 80, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, 
"argument_transformation_stateful": 86, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 200, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, 
"sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 215, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 197, "argument_transformation": 219, "grounded_synthesis": 151, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 198, "argument_transformation_stateful": 241, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 21.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 26.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 260.72, "argument_fidelity": 477.63, "tool_selection": 418.17, "basic_2step": 200.47, "sequential_3step": 589.82, "conditional_routing": 581.21, "sequential_reasoning": 524.65, "error_recovery": 0.0, "data_gap_recovery": 757.27, "data_gap_recovery_extended": 1074.6, "argument_transformation": 779.87, "grounded_synthesis": 1425.95, "inconsistent_api_recovery": 708.76, "relevance_detection_stateful": 131.27, "argument_fidelity_stateful": 807.86, "tool_selection_stateful": 470.88, "basic_2step_stateful": 282.28, "sequential_3step_stateful": 468.06, "conditional_routing_stateful": 545.67, "sequential_reasoning_stateful": 541.4, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 613.07, "data_gap_recovery_extended_stateful": 878.83, "argument_transformation_stateful": 818.26, "grounded_synthesis_stateful": 1367.87, "inconsistent_api_recovery_stateful": 627.95}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 84.5, "accuracy": 84.5, "completeness": 100.0, "efficiency": 96.7, "wasted": 0.6, "speed": 5.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 88, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 70, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 76, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 26, "grounded_synthesis_stateful": 62, "inconsistent_api_recovery_stateful": 82}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 44, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 35, "data_gap_recovery_extended": 22, "argument_transformation": 24, "grounded_synthesis": 38, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 13, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 41}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 176, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 175, "data_gap_recovery_extended": 176, "argument_transformation": 120, "grounded_synthesis": 380, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 190, "data_gap_recovery_extended_stateful": 152, "argument_transformation_stateful": 65, "grounded_synthesis_stateful": 310, "inconsistent_api_recovery_stateful": 328}, "scenarioActualCalls": {"relevance_detection": 104, 
"argument_fidelity": 150, "tool_selection": 166, "basic_2step": 111, "sequential_3step": 151, "conditional_routing": 199, "sequential_reasoning": 248, "error_recovery": 150, "data_gap_recovery": 184, "data_gap_recovery_extended": 124, "argument_transformation": 117, "grounded_synthesis": 204, "inconsistent_api_recovery": 489, "relevance_detection_stateful": 102, "argument_fidelity_stateful": 150, "tool_selection_stateful": 176, "basic_2step_stateful": 106, "sequential_3step_stateful": 154, "conditional_routing_stateful": 214, "sequential_reasoning_stateful": 242, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 192, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 65, "grounded_synthesis_stateful": 160, "inconsistent_api_recovery_stateful": 429}, "scenarioWastedSum": {"relevance_detection": 54.0, "argument_fidelity": 0.0, "tool_selection": 16.0, "basic_2step": 11.0, "sequential_3step": 1.0, "conditional_routing": 35.0, "sequential_reasoning": 48.0, "error_recovery": 50.0, "data_gap_recovery": 29.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 25.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 129.0, "relevance_detection_stateful": 52.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 26.0, "basic_2step_stateful": 6.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 36.0, "sequential_reasoning_stateful": 47.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 38.0, "data_gap_recovery_extended_stateful": 9.0, "argument_transformation_stateful": 20.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 120.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 
50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 27.28, "argument_fidelity": 46.93, "tool_selection": 45.73, "basic_2step": 23.73, "sequential_3step": 60.7, "conditional_routing": 332.82, "sequential_reasoning": 199.27, "error_recovery": 31.84, "data_gap_recovery": 305.59, "data_gap_recovery_extended": 502.05, "argument_transformation": 720.14, "grounded_synthesis": 739.5, "inconsistent_api_recovery": 471.08, "relevance_detection_stateful": 25.56, "argument_fidelity_stateful": 46.68, "tool_selection_stateful": 49.91, "basic_2step_stateful": 26.81, "sequential_3step_stateful": 60.26, "conditional_routing_stateful": 334.34, "sequential_reasoning_stateful": 189.86, "error_recovery_stateful": 29.65, "data_gap_recovery_stateful": 291.02, "data_gap_recovery_extended_stateful": 439.9, "argument_transformation_stateful": 701.42, "grounded_synthesis_stateful": 801.66, "inconsistent_api_recovery_stateful": 483.21}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-sonnet-4-6 AN/N [bare]", "model": "claude-sonnet-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 85.1, "accuracy": 95.0, "completeness": 89.5, "efficiency": 100.0, "wasted": 0.0, "speed": 14.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 68, "basic_2step": 98, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 98, "argument_transformation": 86, "grounded_synthesis": 98, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 66, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 98, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 49, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 102, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 392, "argument_transformation": 215, "grounded_synthesis": 490, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 99, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 245, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 102, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 103, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 170, "argument_transformation": 131, "grounded_synthesis": 149, "inconsistent_api_recovery": 204, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 99, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 166, "argument_transformation_stateful": 149, "grounded_synthesis_stateful": 151, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 149.4, "argument_fidelity": 477.3, "tool_selection": 289.4, "basic_2step": 230.01, "sequential_3step": 385.03, "conditional_routing": 606.67, "sequential_reasoning": 704.29, "error_recovery": 0.0, "data_gap_recovery": 831.9, "data_gap_recovery_extended": 969.89, "argument_transformation": 1306.94, "grounded_synthesis": 1810.36, "inconsistent_api_recovery": 945.83, "relevance_detection_stateful": 128.92, "argument_fidelity_stateful": 498.47, "tool_selection_stateful": 235.38, "basic_2step_stateful": 205.96, "sequential_3step_stateful": 413.16, "conditional_routing_stateful": 607.95, "sequential_reasoning_stateful": 635.96, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 740.76, "data_gap_recovery_extended_stateful": 965.31, "argument_transformation_stateful": 1054.59, "grounded_synthesis_stateful": 1613.78, "inconsistent_api_recovery_stateful": 895.57}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": 
"reforged", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 84.2, "accuracy": 84.2, "completeness": 100.0, "efficiency": 95.4, "wasted": 0.5, "speed": 6.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 98, "data_gap_recovery_extended": 74, "argument_transformation": 26, "grounded_synthesis": 54, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 68, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 52}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 37, "argument_transformation": 13, "grounded_synthesis": 27, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 26}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 245, "data_gap_recovery_extended": 296, "argument_transformation": 65, "grounded_synthesis": 270, "inconsistent_api_recovery": 352, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 272, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 130, "inconsistent_api_recovery_stateful": 208}, "scenarioActualCalls": {"relevance_detection": 98, "argument_fidelity": 150, "tool_selection": 152, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 241, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 210, "data_gap_recovery_extended": 232, "argument_transformation": 53, "grounded_synthesis": 202, "inconsistent_api_recovery": 482, "relevance_detection_stateful": 95, "argument_fidelity_stateful": 150, "tool_selection_stateful": 151, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 239, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 151, "data_gap_recovery_stateful": 231, "data_gap_recovery_extended_stateful": 183, "argument_transformation_stateful": 5, 
"grounded_synthesis_stateful": 85, "inconsistent_api_recovery_stateful": 294}, "scenarioWastedSum": {"relevance_detection": 48.0, "argument_fidelity": 0.0, "tool_selection": 2.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 47.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 9.0, "argument_transformation": 23.0, "grounded_synthesis": 31.0, "inconsistent_api_recovery": 153.0, "relevance_detection_stateful": 45.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 45.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 7.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 6.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 127.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 39.01, "argument_fidelity": 50.24, "tool_selection": 38.07, "basic_2step": 22.55, "sequential_3step": 49.47, "conditional_routing": 374.61, "sequential_reasoning": 272.23, 
"error_recovery": 29.9, "data_gap_recovery": 383.21, "data_gap_recovery_extended": 500.45, "argument_transformation": 760.76, "grounded_synthesis": 547.9, "inconsistent_api_recovery": 752.54, "relevance_detection_stateful": 42.36, "argument_fidelity_stateful": 51.4, "tool_selection_stateful": 38.56, "basic_2step_stateful": 26.06, "sequential_3step_stateful": 50.15, "conditional_routing_stateful": 345.91, "sequential_reasoning_stateful": 272.55, "error_recovery_stateful": 30.63, "data_gap_recovery_stateful": 422.76, "data_gap_recovery_extended_stateful": 526.82, "argument_transformation_stateful": 766.07, "grounded_synthesis_stateful": 556.34, "inconsistent_api_recovery_stateful": 803.56}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 84.4, "accuracy": 91.1, "completeness": 92.6, "efficiency": 91.8, "wasted": 0.7, "speed": 4.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 6, "basic_2step": 100, 
"sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 98, "argument_transformation": 8, "grounded_synthesis": 100, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 4, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 98, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 9, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 392, "argument_transformation": 20, "grounded_synthesis": 500, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 490, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 15, "basic_2step": 100, "sequential_3step": 151, "conditional_routing": 250, "sequential_reasoning": 354, "error_recovery": 150, "data_gap_recovery": 154, "data_gap_recovery_extended": 331, "argument_transformation": 16, "grounded_synthesis": 606, "inconsistent_api_recovery": 380, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 10, "basic_2step_stateful": 100, "sequential_3step_stateful": 152, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 349, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 154, "data_gap_recovery_extended_stateful": 340, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 598, "inconsistent_api_recovery_stateful": 425}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 6.0, "basic_2step": 0.0, "sequential_3step": 1.0, "conditional_routing": 50.0, "sequential_reasoning": 154.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, 
"data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 106.0, "inconsistent_api_recovery": 84.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 4.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 149.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 108.0, "inconsistent_api_recovery_stateful": 75.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 17.18, "argument_fidelity": 65.52, "tool_selection": 7.68, "basic_2step": 29.43, "sequential_3step": 88.75, "conditional_routing": 152.26, "sequential_reasoning": 285.81, "error_recovery": 35.55, "data_gap_recovery": 202.48, "data_gap_recovery_extended": 610.69, "argument_transformation": 298.38, "grounded_synthesis": 722.48, "inconsistent_api_recovery": 267.72, "relevance_detection_stateful": 17.2, "argument_fidelity_stateful": 64.95, "tool_selection_stateful": 5.81, "basic_2step_stateful": 32.91, 
"sequential_3step_stateful": 91.93, "conditional_routing_stateful": 152.09, "sequential_reasoning_stateful": 276.68, "error_recovery_stateful": 35.54, "data_gap_recovery_stateful": 203.45, "data_gap_recovery_extended_stateful": 609.26, "argument_transformation_stateful": 262.02, "grounded_synthesis_stateful": 760.53, "inconsistent_api_recovery_stateful": 265.43}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged]", "model": "Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 82.8, "accuracy": 82.8, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.2, "speed": 10.4, "n": 50, "scenarios": {"relevance_detection": 48, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 74, "argument_transformation": 16, "grounded_synthesis": 62, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 56, "argument_fidelity_stateful": 100, "tool_selection_stateful": 
100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 68, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 58, "inconsistent_api_recovery_stateful": 82}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 24, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 37, "argument_transformation": 8, "grounded_synthesis": 31, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 28, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 7, "grounded_synthesis_stateful": 29, 
"inconsistent_api_recovery_stateful": 41}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 24, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 296, 
"argument_transformation": 40, "grounded_synthesis": 310, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 28, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 272, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 290, "inconsistent_api_recovery_stateful": 328}, "scenarioActualCalls": {"relevance_detection": 25, "argument_fidelity": 152, "tool_selection": 201, "basic_2step": 101, "sequential_3step": 150, "conditional_routing": 190, "sequential_reasoning": 197, "error_recovery": 161, "data_gap_recovery": 195, "data_gap_recovery_extended": 154, "argument_transformation": 29, "grounded_synthesis": 183, "inconsistent_api_recovery": 283, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 152, "tool_selection_stateful": 200, "basic_2step_stateful": 109, "sequential_3step_stateful": 152, "conditional_routing_stateful": 198, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 159, "data_gap_recovery_stateful": 211, "data_gap_recovery_extended_stateful": 132, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 146, "inconsistent_api_recovery_stateful": 256}, "scenarioWastedSum": {"relevance_detection": 2.0, "argument_fidelity": 2.0, "tool_selection": 51.0, "basic_2step": 1.0, "sequential_3step": 0.0, "conditional_routing": 28.0, "sequential_reasoning": 1.0, "error_recovery": 61.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 9.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 2.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 9.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 
30.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 9.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 1.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 60.9, "argument_fidelity": 177.92, "tool_selection": 172.67, "basic_2step": 429.56, "sequential_3step": 489.02, "conditional_routing": 476.02, "sequential_reasoning": 320.68, "error_recovery": 596.7, "data_gap_recovery": 444.68, "data_gap_recovery_extended": 542.16, "argument_transformation": 1056.25, "grounded_synthesis": 1059.21, "inconsistent_api_recovery": 1016.89, "relevance_detection_stateful": 56.3, "argument_fidelity_stateful": 283.08, "tool_selection_stateful": 171.91, "basic_2step_stateful": 274.07, "sequential_3step_stateful": 417.92, "conditional_routing_stateful": 488.83, "sequential_reasoning_stateful": 357.79, "error_recovery_stateful": 559.77, "data_gap_recovery_stateful": 440.86, "data_gap_recovery_extended_stateful": 593.31, "argument_transformation_stateful": 1069.76, "grounded_synthesis_stateful": 1053.18, 
"inconsistent_api_recovery_stateful": 879.17}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 82.8, "accuracy": 82.8, "completeness": 99.9, "efficiency": 95.0, "wasted": 0.5, "speed": 4.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 98, "data_gap_recovery_extended": 66, "argument_transformation": 24, "grounded_synthesis": 34, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 70, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 42}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 17, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 21}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 245, "data_gap_recovery_extended": 264, "argument_transformation": 60, "grounded_synthesis": 170, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, 
"error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 168}, "scenarioActualCalls": {"relevance_detection": 93, "argument_fidelity": 150, "tool_selection": 157, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 241, "sequential_reasoning": 251, "error_recovery": 150, "data_gap_recovery": 222, "data_gap_recovery_extended": 176, "argument_transformation": 57, "grounded_synthesis": 111, "inconsistent_api_recovery": 518, "relevance_detection_stateful": 97, "argument_fidelity_stateful": 150, "tool_selection_stateful": 157, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 230, "sequential_reasoning_stateful": 251, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 181, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 92, "inconsistent_api_recovery_stateful": 254}, "scenarioWastedSum": {"relevance_detection": 43.0, "argument_fidelity": 0.0, "tool_selection": 7.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 49.0, "sequential_reasoning": 51.0, "error_recovery": 50.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 11.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 164.0, "relevance_detection_stateful": 47.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 7.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 45.0, "sequential_reasoning_stateful": 51.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 164.0}, "scenarioWastedN": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 28.01, "argument_fidelity": 32.22, "tool_selection": 28.8, "basic_2step": 15.06, "sequential_3step": 28.94, "conditional_routing": 298.39, "sequential_reasoning": 218.3, "error_recovery": 17.43, "data_gap_recovery": 255.57, "data_gap_recovery_extended": 383.55, "argument_transformation": 489.01, "grounded_synthesis": 349.36, "inconsistent_api_recovery": 472.38, "relevance_detection_stateful": 30.84, "argument_fidelity_stateful": 34.56, "tool_selection_stateful": 28.66, "basic_2step_stateful": 17.06, "sequential_3step_stateful": 28.27, "conditional_routing_stateful": 294.55, "sequential_reasoning_stateful": 223.12, "error_recovery_stateful": 19.26, "data_gap_recovery_stateful": 258.39, "data_gap_recovery_extended_stateful": 370.46, "argument_transformation_stateful": 478.97, "grounded_synthesis_stateful": 401.76, "inconsistent_api_recovery_stateful": 511.13}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-27B-Q4_K_M LS/P [reforged]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 83.5, "accuracy": 85.0, "completeness": 98.2, "efficiency": 97.0, "wasted": 0.4, "speed": 53.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 98, "data_gap_recovery_extended": 6, "argument_transformation": 66, "grounded_synthesis": 52, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 56, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 80}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 3, "argument_transformation": 33, "grounded_synthesis": 26, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 40}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, 
"data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 245, "data_gap_recovery_extended": 24, "argument_transformation": 165, "grounded_synthesis": 260, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 140, "grounded_synthesis_stateful": 180, "inconsistent_api_recovery_stateful": 320}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, 
"sequential_3step": 150, "conditional_routing": 281, "sequential_reasoning": 200, "error_recovery": 151, "data_gap_recovery": 237, "data_gap_recovery_extended": 12, "argument_transformation": 143, "grounded_synthesis": 251, "inconsistent_api_recovery": 369, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 155, "conditional_routing_stateful": 277, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 145, "data_gap_recovery_stateful": 218, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 113, "grounded_synthesis_stateful": 166, "inconsistent_api_recovery_stateful": 323}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 82.0, "sequential_reasoning": 0.0, "error_recovery": 51.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 6.0, "grounded_synthesis": 66.0, "inconsistent_api_recovery": 52.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 5.0, "conditional_routing_stateful": 81.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 3.0, "grounded_synthesis_stateful": 57.0, "inconsistent_api_recovery_stateful": 36.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioSpeedSum": {"relevance_detection": 394.79, "argument_fidelity": 569.94, "tool_selection": 439.43, "basic_2step": 513.59, "sequential_3step": 761.2, "conditional_routing": 2251.22, "sequential_reasoning": 1014.67, "error_recovery": 1982.98, "data_gap_recovery": 2397.0, "data_gap_recovery_extended": 3303.2, "argument_transformation": 6040.23, "grounded_synthesis": 5500.19, "inconsistent_api_recovery": 9925.29, "relevance_detection_stateful": 449.83, "argument_fidelity_stateful": 580.21, "tool_selection_stateful": 449.79, "basic_2step_stateful": 1244.72, "sequential_3step_stateful": 829.13, "conditional_routing_stateful": 2522.43, "sequential_reasoning_stateful": 966.14, "error_recovery_stateful": 1790.76, "data_gap_recovery_stateful": 2280.2, "data_gap_recovery_extended_stateful": 3180.8, "argument_transformation_stateful": 5799.48, "grounded_synthesis_stateful": 5363.13, "inconsistent_api_recovery_stateful": 8271.52}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, 
"data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 82.2, "accuracy": 82.2, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.3, "speed": 23.6, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 90, "sequential_reasoning": 92, "error_recovery": 98, "data_gap_recovery": 92, "data_gap_recovery_extended": 16, "argument_transformation": 46, "grounded_synthesis": 62, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 94}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 45, "sequential_reasoning": 46, "error_recovery": 49, "data_gap_recovery": 46, "data_gap_recovery_extended": 8, "argument_transformation": 23, "grounded_synthesis": 31, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 47}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 180, "sequential_reasoning": 184, "error_recovery": 98, "data_gap_recovery": 230, "data_gap_recovery_extended": 64, "argument_transformation": 115, "grounded_synthesis": 310, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 176, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 105, "grounded_synthesis_stateful": 250, "inconsistent_api_recovery_stateful": 376}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 189, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 174, "sequential_reasoning": 184, "error_recovery": 150, "data_gap_recovery": 186, "data_gap_recovery_extended": 32, "argument_transformation": 84, "grounded_synthesis": 160, "inconsistent_api_recovery": 442, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 176, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 187, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 166, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 81, "grounded_synthesis_stateful": 120, "inconsistent_api_recovery_stateful": 398}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 39.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 28.0, "sequential_reasoning": 0.0, "error_recovery": 54.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 78.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 26.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 40.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 56.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 421.31, "argument_fidelity": 311.17, "tool_selection": 533.83, "basic_2step": 405.51, "sequential_3step": 506.4, "conditional_routing": 647.68, "sequential_reasoning": 598.51, "error_recovery": 1819.19, "data_gap_recovery": 773.69, "data_gap_recovery_extended": 902.36, "argument_transformation": 2146.81, "grounded_synthesis": 2281.21, "inconsistent_api_recovery": 3795.79, "relevance_detection_stateful": 473.26, "argument_fidelity_stateful": 285.67, "tool_selection_stateful": 402.3, "basic_2step_stateful": 420.15, "sequential_3step_stateful": 413.66, "conditional_routing_stateful": 683.53, "sequential_reasoning_stateful": 432.88, "error_recovery_stateful": 2249.88, "data_gap_recovery_stateful": 816.62, "data_gap_recovery_extended_stateful": 1018.74, "argument_transformation_stateful": 2224.15, "grounded_synthesis_stateful": 2295.51, "inconsistent_api_recovery_stateful": 3800.19}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": 
"llamaserver", "mode": "native", "ablation": "reforged", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 81.4, "accuracy": 81.4, "completeness": 100.0, "efficiency": 99.8, "wasted": 0.3, "speed": 4.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 38, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 74, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 19, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 152, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 296, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 221, "data_gap_recovery_extended": 96, "argument_transformation": 6, "grounded_synthesis": 0, "inconsistent_api_recovery": 350, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 209, "data_gap_recovery_extended_stateful": 183, "argument_transformation_stateful": 
0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 350}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 27.75, "argument_fidelity": 59.91, "tool_selection": 55.07, "basic_2step": 36.12, "sequential_3step": 37.56, "conditional_routing": 143.18, "sequential_reasoning": 145.03, "error_recovery": 
196.3, "data_gap_recovery": 293.16, "data_gap_recovery_extended": 462.77, "argument_transformation": 579.58, "grounded_synthesis": 304.52, "inconsistent_api_recovery": 234.97, "relevance_detection_stateful": 27.15, "argument_fidelity_stateful": 59.61, "tool_selection_stateful": 55.04, "basic_2step_stateful": 32.04, "sequential_3step_stateful": 37.56, "conditional_routing_stateful": 169.4, "sequential_reasoning_stateful": 189.44, "error_recovery_stateful": 196.54, "data_gap_recovery_stateful": 284.0, "data_gap_recovery_extended_stateful": 433.7, "argument_transformation_stateful": 575.81, "grounded_synthesis_stateful": 296.56, "inconsistent_api_recovery_stateful": 234.81}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 81.3, "accuracy": 85.0, "completeness": 95.7, "efficiency": 96.1, "wasted": 0.7, "speed": 3.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 96, "basic_2step": 100, "sequential_3step": 
100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 92, "data_gap_recovery": 98, "data_gap_recovery_extended": 56, "argument_transformation": 14, "grounded_synthesis": 46, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 88, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 94, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 82, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 34}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 46, "data_gap_recovery": 49, "data_gap_recovery_extended": 28, "argument_transformation": 7, "grounded_synthesis": 23, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 17}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 47, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 92, "data_gap_recovery": 245, "data_gap_recovery_extended": 224, "argument_transformation": 35, "grounded_synthesis": 230, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 132, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 328, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 240, "inconsistent_api_recovery_stateful": 136}, "scenarioActualCalls": {"relevance_detection": 81, "argument_fidelity": 160, "tool_selection": 193, "basic_2step": 100, "sequential_3step": 151, "conditional_routing": 242, "sequential_reasoning": 200, "error_recovery": 138, "data_gap_recovery": 212, "data_gap_recovery_extended": 153, "argument_transformation": 47, "grounded_synthesis": 184, "inconsistent_api_recovery": 379, "relevance_detection_stateful": 74, "argument_fidelity_stateful": 151, "tool_selection_stateful": 165, "basic_2step_stateful": 100, "sequential_3step_stateful": 152, "conditional_routing_stateful": 244, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 261, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 196, "inconsistent_api_recovery_stateful": 193}, "scenarioWastedSum": {"relevance_detection": 31.0, "argument_fidelity": 10.0, "tool_selection": 49.0, "basic_2step": 0.0, "sequential_3step": 1.0, "conditional_routing": 47.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 4.0, 
"data_gap_recovery_extended": 0.0, "argument_transformation": 80.0, "grounded_synthesis": 54.0, "inconsistent_api_recovery": 137.0, "relevance_detection_stateful": 24.0, "argument_fidelity_stateful": 1.0, "tool_selection_stateful": 33.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 49.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 58.0, "grounded_synthesis_stateful": 56.0, "inconsistent_api_recovery_stateful": 155.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 38.92, "argument_fidelity": 66.87, "tool_selection": 70.67, "basic_2step": 29.04, "sequential_3step": 62.73, "conditional_routing": 168.96, "sequential_reasoning": 78.96, "error_recovery": 35.44, "data_gap_recovery": 206.94, "data_gap_recovery_extended": 454.94, "argument_transformation": 227.5, "grounded_synthesis": 589.95, "inconsistent_api_recovery": 345.05, "relevance_detection_stateful": 31.27, "argument_fidelity_stateful": 57.55, "tool_selection_stateful": 63.82, "basic_2step_stateful": 32.96, 
"sequential_3step_stateful": 65.29, "conditional_routing_stateful": 170.38, "sequential_reasoning_stateful": 77.26, "error_recovery_stateful": 35.41, "data_gap_recovery_stateful": 215.54, "data_gap_recovery_extended_stateful": 449.19, "argument_transformation_stateful": 228.16, "grounded_synthesis_stateful": 743.01, "inconsistent_api_recovery_stateful": 328.75}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-sonnet-4-6 AN/N [bare+any]", "model": "claude-sonnet-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 81.5, "accuracy": 88.2, "completeness": 92.3, "efficiency": 100.0, "wasted": 0.0, "speed": 11.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 12, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 16, "grounded_synthesis_stateful": 90, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 6, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 
0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 30, "grounded_synthesis": 500, 
"inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 40, "grounded_synthesis_stateful": 450, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 198, "argument_transformation": 24, "grounded_synthesis": 250, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 198, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 225, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 145.14, "argument_fidelity": 325.1, "tool_selection": 358.59, "basic_2step": 211.29, "sequential_3step": 316.7, "conditional_routing": 573.63, "sequential_reasoning": 415.37, "error_recovery": 0.0, "data_gap_recovery": 650.93, "data_gap_recovery_extended": 787.28, "argument_transformation": 650.86, "grounded_synthesis": 1426.01, "inconsistent_api_recovery": 708.65, "relevance_detection_stateful": 182.37, "argument_fidelity_stateful": 449.82, "tool_selection_stateful": 371.42, "basic_2step_stateful": 210.18, "sequential_3step_stateful": 761.07, "conditional_routing_stateful": 550.97, "sequential_reasoning_stateful": 537.42, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 633.0, "data_gap_recovery_extended_stateful": 933.1, "argument_transformation_stateful": 573.31, "grounded_synthesis_stateful": 1470.15, "inconsistent_api_recovery_stateful": 714.47}, "scenarioSpeedN": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 80.2, "accuracy": 80.2, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.0, "speed": 2.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 36, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 18, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 180, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 250, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 158, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 118, "inconsistent_api_recovery": 227, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 151, "conditional_routing_stateful": 158, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 160, "inconsistent_api_recovery_stateful": 224}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 4.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 4.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.74, "argument_fidelity": 59.09, "tool_selection": 52.27, "basic_2step": 24.06, "sequential_3step": 69.28, "conditional_routing": 179.82, "sequential_reasoning": 106.43, "error_recovery": 37.96, "data_gap_recovery": 150.18, "data_gap_recovery_extended": 178.67, "argument_transformation": 207.3, "grounded_synthesis": 504.27, "inconsistent_api_recovery": 309.32, "relevance_detection_stateful": 16.75, "argument_fidelity_stateful": 59.27, "tool_selection_stateful": 52.27, "basic_2step_stateful": 24.07, "sequential_3step_stateful": 68.62, "conditional_routing_stateful": 179.52, "sequential_reasoning_stateful": 108.79, "error_recovery_stateful": 37.96, "data_gap_recovery_stateful": 150.51, "data_gap_recovery_extended_stateful": 178.61, "argument_transformation_stateful": 207.89, "grounded_synthesis_stateful": 487.44, "inconsistent_api_recovery_stateful": 308.01}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 
50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 80.5, "accuracy": 83.0, "completeness": 97.0, "efficiency": 95.7, "wasted": 0.7, "speed": 2.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 98, "tool_selection": 98, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 96, "data_gap_recovery_extended": 74, "argument_transformation": 10, "grounded_synthesis": 36, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 96, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 70, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 20}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 37, "argument_transformation": 5, "grounded_synthesis": 18, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 10}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 38, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 31, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 38, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 31, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 147, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 240, "data_gap_recovery_extended": 296, "argument_transformation": 25, "grounded_synthesis": 180, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 144, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 190, "inconsistent_api_recovery_stateful": 80}, "scenarioActualCalls": {"relevance_detection": 73, "argument_fidelity": 164, "tool_selection": 192, "basic_2step": 100, "sequential_3step": 152, "conditional_routing": 230, "sequential_reasoning": 196, 
"error_recovery": 150, "data_gap_recovery": 217, "data_gap_recovery_extended": 221, "argument_transformation": 32, "grounded_synthesis": 165, "inconsistent_api_recovery": 386, "relevance_detection_stateful": 67, "argument_fidelity_stateful": 167, "tool_selection_stateful": 180, "basic_2step_stateful": 100, "sequential_3step_stateful": 154, "conditional_routing_stateful": 231, "sequential_reasoning_stateful": 201, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 211, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 144, "inconsistent_api_recovery_stateful": 106}, "scenarioWastedSum": {"relevance_detection": 23.0, "argument_fidelity": 17.0, "tool_selection": 45.0, "basic_2step": 0.0, "sequential_3step": 2.0, "conditional_routing": 41.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 6.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 55.0, "grounded_synthesis": 67.0, "inconsistent_api_recovery": 158.0, "relevance_detection_stateful": 17.0, "argument_fidelity_stateful": 20.0, "tool_selection_stateful": 36.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 42.0, "sequential_reasoning_stateful": 1.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 14.0, "data_gap_recovery_extended_stateful": 3.0, "argument_transformation_stateful": 63.0, "grounded_synthesis_stateful": 56.0, "inconsistent_api_recovery_stateful": 174.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 38, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 48, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 31, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 24.29, "argument_fidelity": 45.81, "tool_selection": 62.12, "basic_2step": 19.04, "sequential_3step": 44.18, "conditional_routing": 113.1, "sequential_reasoning": 49.27, "error_recovery": 23.45, "data_gap_recovery": 170.13, "data_gap_recovery_extended": 269.65, "argument_transformation": 219.2, "grounded_synthesis": 445.79, "inconsistent_api_recovery": 230.71, "relevance_detection_stateful": 19.97, "argument_fidelity_stateful": 49.62, "tool_selection_stateful": 53.16, "basic_2step_stateful": 21.48, "sequential_3step_stateful": 40.54, "conditional_routing_stateful": 109.73, "sequential_reasoning_stateful": 49.03, "error_recovery_stateful": 23.47, "data_gap_recovery_stateful": 169.3, "data_gap_recovery_extended_stateful": 289.88, "argument_transformation_stateful": 155.92, "grounded_synthesis_stateful": 449.54, "inconsistent_api_recovery_stateful": 218.06}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 38, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 31, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:14b-q4_K_M OL/N [reforged]", "model": "qwen3:14b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 78.6, "accuracy": 78.7, "completeness": 99.9, "efficiency": 76.7, "wasted": 1.2, "speed": 38.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 74, "data_gap_recovery_extended": 4, "argument_transformation": 12, "grounded_synthesis": 68, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 94, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 54, "inconsistent_api_recovery_stateful": 68}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 37, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 34, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 34}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 185, "data_gap_recovery_extended": 16, "argument_transformation": 30, "grounded_synthesis": 340, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 270, "inconsistent_api_recovery_stateful": 272}, "scenarioActualCalls": {"relevance_detection": 60, "argument_fidelity": 155, "tool_selection": 198, "basic_2step": 203, "sequential_3step": 153, "conditional_routing": 268, "sequential_reasoning": 200, "error_recovery": 158, "data_gap_recovery": 182, "data_gap_recovery_extended": 14, "argument_transformation": 24, "grounded_synthesis": 560, "inconsistent_api_recovery": 464, "relevance_detection_stateful": 57, "argument_fidelity_stateful": 152, "tool_selection_stateful": 200, "basic_2step_stateful": 153, "sequential_3step_stateful": 154, 
"conditional_routing_stateful": 270, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 152, "data_gap_recovery_stateful": 219, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 460, "inconsistent_api_recovery_stateful": 415}, "scenarioWastedSum": {"relevance_detection": 10.0, "argument_fidelity": 5.0, "tool_selection": 48.0, "basic_2step": 103.0, "sequential_3step": 3.0, "conditional_routing": 68.0, "sequential_reasoning": 0.0, "error_recovery": 58.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 317.0, "inconsistent_api_recovery": 193.0, "relevance_detection_stateful": 7.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 53.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 70.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 14.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 312.0, "inconsistent_api_recovery_stateful": 189.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioSpeedSum": {"relevance_detection": 132.08, "argument_fidelity": 415.71, "tool_selection": 591.32, "basic_2step": 944.65, "sequential_3step": 806.26, "conditional_routing": 1491.25, "sequential_reasoning": 783.04, "error_recovery": 650.24, "data_gap_recovery": 1061.72, "data_gap_recovery_extended": 1733.48, "argument_transformation": 3533.73, "grounded_synthesis": 7452.35, "inconsistent_api_recovery": 5646.86, "relevance_detection_stateful": 120.01, "argument_fidelity_stateful": 406.89, "tool_selection_stateful": 647.89, "basic_2step_stateful": 634.11, "sequential_3step_stateful": 821.99, "conditional_routing_stateful": 1621.36, "sequential_reasoning_stateful": 874.04, "error_recovery_stateful": 663.65, "data_gap_recovery_stateful": 1232.61, "data_gap_recovery_extended_stateful": 1850.08, "argument_transformation_stateful": 3127.15, "grounded_synthesis_stateful": 7415.91, "inconsistent_api_recovery_stateful": 5416.09}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "ministral-14b", "quant": "q4_K_M", 
"gen": 2, "retired": false, "score": 79.5, "accuracy": 80.5, "completeness": 98.7, "efficiency": 96.5, "wasted": 0.5, "speed": 3.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 82, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 78, "data_gap_recovery_extended": 30, "argument_transformation": 6, "grounded_synthesis": 58, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 74, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 56, "inconsistent_api_recovery_stateful": 84}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 41, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 39, "data_gap_recovery_extended": 15, "argument_transformation": 3, "grounded_synthesis": 29, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 42}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 48, "grounded_synthesis": 47, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 46}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 48, "grounded_synthesis": 47, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 46}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 164, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 195, "data_gap_recovery_extended": 120, "argument_transformation": 15, "grounded_synthesis": 290, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 148, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 80, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 280, "inconsistent_api_recovery_stateful": 336}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 193, "sequential_reasoning": 202, "error_recovery": 150, "data_gap_recovery": 167, "data_gap_recovery_extended": 62, "argument_transformation": 16, "grounded_synthesis": 197, "inconsistent_api_recovery": 556, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 151, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 177, "sequential_reasoning_stateful": 204, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 176, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 190, "inconsistent_api_recovery_stateful": 516}, 
"scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 34.0, "sequential_reasoning": 2.0, "error_recovery": 50.0, "data_gap_recovery": 10.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 42.0, "grounded_synthesis": 20.0, "inconsistent_api_recovery": 192.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 33.0, "sequential_reasoning_stateful": 4.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 15.0, "data_gap_recovery_extended_stateful": 16.0, "argument_transformation_stateful": 37.0, "grounded_synthesis_stateful": 12.0, "inconsistent_api_recovery_stateful": 209.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 48, "grounded_synthesis": 47, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 46}, "scenarioSpeedSum": {"relevance_detection": 17.28, "argument_fidelity": 57.44, "tool_selection": 49.32, "basic_2step": 29.79, "sequential_3step": 61.71, "conditional_routing": 176.74, "sequential_reasoning": 86.82, "error_recovery": 38.44, "data_gap_recovery": 219.46, "data_gap_recovery_extended": 360.22, 
"argument_transformation": 396.91, "grounded_synthesis": 614.33, "inconsistent_api_recovery": 293.83, "relevance_detection_stateful": 16.94, "argument_fidelity_stateful": 58.78, "tool_selection_stateful": 51.38, "basic_2step_stateful": 33.46, "sequential_3step_stateful": 62.76, "conditional_routing_stateful": 186.49, "sequential_reasoning_stateful": 91.25, "error_recovery_stateful": 38.69, "data_gap_recovery_stateful": 179.6, "data_gap_recovery_extended_stateful": 325.72, "argument_transformation_stateful": 435.52, "grounded_synthesis_stateful": 538.04, "inconsistent_api_recovery_stateful": 305.34}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 48, "grounded_synthesis": 47, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 46}}, {"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 78.1, "accuracy": 78.1, "completeness": 100.0, "efficiency": 97.2, "wasted": 0.3, "speed": 4.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, 
"error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 16, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, 
"argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 64, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 151, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 200, "data_gap_recovery_extended": 40, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 350, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 350}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, 
"inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 29.4, "argument_fidelity": 51.04, "tool_selection": 48.45, "basic_2step": 23.57, "sequential_3step": 37.06, "conditional_routing": 151.37, "sequential_reasoning": 196.42, "error_recovery": 25.08, "data_gap_recovery": 308.69, "data_gap_recovery_extended": 347.58, "argument_transformation": 367.21, "grounded_synthesis": 796.04, "inconsistent_api_recovery": 231.38, "relevance_detection_stateful": 29.26, "argument_fidelity_stateful": 51.97, "tool_selection_stateful": 48.45, "basic_2step_stateful": 23.6, "sequential_3step_stateful": 37.17, "conditional_routing_stateful": 152.54, "sequential_reasoning_stateful": 
179.86, "error_recovery_stateful": 25.07, "data_gap_recovery_stateful": 320.4, "data_gap_recovery_extended_stateful": 353.46, "argument_transformation_stateful": 373.44, "grounded_synthesis_stateful": 775.69, "inconsistent_api_recovery_stateful": 231.11}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 78.3, "accuracy": 78.4, "completeness": 99.8, "efficiency": 95.3, "wasted": 0.4, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 98, "data_gap_recovery": 100, "data_gap_recovery_extended": 22, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, 
"conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 11, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 98, "data_gap_recovery": 250, "data_gap_recovery_extended": 88, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 250, "error_recovery": 147, "data_gap_recovery": 241, "data_gap_recovery_extended": 55, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 350, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 350}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 50.0, "error_recovery": 49.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 9.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 12.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.75, "argument_fidelity": 35.27, "tool_selection": 36.48, "basic_2step": 19.55, "sequential_3step": 61.83, "conditional_routing": 101.8, "sequential_reasoning": 90.59, "error_recovery": 182.93, "data_gap_recovery": 199.5, "data_gap_recovery_extended": 279.31, "argument_transformation": 434.73, "grounded_synthesis": 441.42, "inconsistent_api_recovery": 156.32, "relevance_detection_stateful": 16.77, "argument_fidelity_stateful": 36.28, "tool_selection_stateful": 36.45, "basic_2step_stateful": 19.57, "sequential_3step_stateful": 60.67, "conditional_routing_stateful": 103.79, "sequential_reasoning_stateful": 107.83, "error_recovery_stateful": 188.94, "data_gap_recovery_stateful": 198.23, "data_gap_recovery_extended_stateful": 285.43, "argument_transformation_stateful": 433.95, "grounded_synthesis_stateful": 436.1, "inconsistent_api_recovery_stateful": 156.06}, "scenarioSpeedN": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [reforged]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 78.2, "accuracy": 82.2, "completeness": 95.1, "efficiency": 98.3, "wasted": 0.5, "speed": 9.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 92, "sequential_reasoning": 98, "error_recovery": 98, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 24, "grounded_synthesis": 80, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 90, "error_recovery_stateful": 94, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 84, "inconsistent_api_recovery_stateful": 40}, "scenarioRuns": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 46, "sequential_reasoning": 49, "error_recovery": 49, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 12, "grounded_synthesis": 40, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 20}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 25}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 25}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 184, "sequential_reasoning": 196, "error_recovery": 98, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 60, "grounded_synthesis": 400, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 180, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 420, "inconsistent_api_recovery_stateful": 160}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 154, "tool_selection": 165, "basic_2step": 100, "sequential_3step": 158, "conditional_routing": 187, "sequential_reasoning": 213, "error_recovery": 208, "data_gap_recovery": 236, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 214, "inconsistent_api_recovery": 289, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 156, "tool_selection_stateful": 169, "basic_2step_stateful": 101, "sequential_3step_stateful": 168, "conditional_routing_stateful": 206, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 180, "data_gap_recovery_stateful": 241, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 260, "inconsistent_api_recovery_stateful": 226}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 4.0, "tool_selection": 15.0, "basic_2step": 0.0, "sequential_3step": 8.0, "conditional_routing": 19.0, "sequential_reasoning": 17.0, "error_recovery": 110.0, "data_gap_recovery": 28.0, "data_gap_recovery_extended": 22.0, "argument_transformation": 1.0, "grounded_synthesis": 35.0, "inconsistent_api_recovery": 102.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 6.0, "tool_selection_stateful": 19.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 18.0, "conditional_routing_stateful": 27.0, "sequential_reasoning_stateful": 8.0, "error_recovery_stateful": 39.0, "data_gap_recovery_stateful": 19.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 54.0, "inconsistent_api_recovery_stateful": 87.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, 
"sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 25}, "scenarioSpeedSum": {"relevance_detection": 80.37, "argument_fidelity": 153.17, "tool_selection": 161.44, "basic_2step": 69.53, "sequential_3step": 150.51, "conditional_routing": 441.13, "sequential_reasoning": 319.06, "error_recovery": 249.48, "data_gap_recovery": 716.33, "data_gap_recovery_extended": 728.91, "argument_transformation": 819.93, "grounded_synthesis": 769.22, "inconsistent_api_recovery": 970.52, "relevance_detection_stateful": 82.01, "argument_fidelity_stateful": 168.72, "tool_selection_stateful": 168.51, "basic_2step_stateful": 72.59, "sequential_3step_stateful": 160.48, "conditional_routing_stateful": 482.09, "sequential_reasoning_stateful": 313.61, "error_recovery_stateful": 236.22, "data_gap_recovery_stateful": 681.84, "data_gap_recovery_extended_stateful": 659.16, "argument_transformation_stateful": 919.46, "grounded_synthesis_stateful": 848.4, "inconsistent_api_recovery_stateful": 760.2}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 25}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged]", "model": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "mistral-small-3.2", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 78.2, "accuracy": 84.3, "completeness": 92.7, "efficiency": 77.6, "wasted": 1.1, "speed": 3.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 28, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 76}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 
50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 14, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 38}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 6, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 70, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 85, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 304}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, 
"error_recovery": 150, "data_gap_recovery": 70, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 887, "inconsistent_api_recovery": 484, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 900, "inconsistent_api_recovery_stateful": 434}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 59.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 18.0, "grounded_synthesis": 392.0, "inconsistent_api_recovery": 139.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 31.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 48.0, "grounded_synthesis_stateful": 400.0, "inconsistent_api_recovery_stateful": 136.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 31.51, "argument_fidelity": 100.79, "tool_selection": 72.96, "basic_2step": 42.69, "sequential_3step": 106.2, "conditional_routing": 191.77, "sequential_reasoning": 114.42, "error_recovery": 58.31, "data_gap_recovery": 233.12, "data_gap_recovery_extended": 202.28, "argument_transformation": 16.3, "grounded_synthesis": 529.12, "inconsistent_api_recovery": 448.89, "relevance_detection_stateful": 30.6, "argument_fidelity_stateful": 100.84, "tool_selection_stateful": 72.56, "basic_2step_stateful": 62.05, "sequential_3step_stateful": 102.57, "conditional_routing_stateful": 198.68, "sequential_reasoning_stateful": 113.34, "error_recovery_stateful": 58.26, "data_gap_recovery_stateful": 192.37, "data_gap_recovery_extended_stateful": 205.83, "argument_transformation_stateful": 47.2, "grounded_synthesis_stateful": 525.85, "inconsistent_api_recovery_stateful": 454.37}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 6, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [reforged]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 76.2, "accuracy": 80.7, "completeness": 94.5, "efficiency": 98.0, "wasted": 0.6, "speed": 12.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 84, "sequential_reasoning": 88, "error_recovery": 90, "data_gap_recovery": 96, "data_gap_recovery_extended": 2, "argument_transformation": 14, "grounded_synthesis": 80, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 90, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 32}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 42, "sequential_reasoning": 44, "error_recovery": 45, "data_gap_recovery": 48, "data_gap_recovery_extended": 1, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 16}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 45, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 19}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 45, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 19}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 168, "sequential_reasoning": 176, "error_recovery": 90, "data_gap_recovery": 240, "data_gap_recovery_extended": 8, "argument_transformation": 35, "grounded_synthesis": 400, "inconsistent_api_recovery": 176, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 176, "sequential_reasoning_stateful": 180, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 400, "inconsistent_api_recovery_stateful": 128}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 159, "tool_selection": 166, "basic_2step": 100, "sequential_3step": 164, "conditional_routing": 185, "sequential_reasoning": 189, "error_recovery": 202, "data_gap_recovery": 253, "data_gap_recovery_extended": 6, "argument_transformation": 26, "grounded_synthesis": 168, "inconsistent_api_recovery": 271, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 157, "tool_selection_stateful": 174, "basic_2step_stateful": 100, "sequential_3step_stateful": 167, "conditional_routing_stateful": 197, 
"sequential_reasoning_stateful": 203, "error_recovery_stateful": 207, "data_gap_recovery_stateful": 267, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 201, "inconsistent_api_recovery_stateful": 176}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 9.0, "tool_selection": 16.0, "basic_2step": 0.0, "sequential_3step": 14.0, "conditional_routing": 25.0, "sequential_reasoning": 14.0, "error_recovery": 112.0, "data_gap_recovery": 34.0, "data_gap_recovery_extended": 12.0, "argument_transformation": 1.0, "grounded_synthesis": 7.0, "inconsistent_api_recovery": 106.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 24.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 17.0, "conditional_routing_stateful": 31.0, "sequential_reasoning_stateful": 26.0, "error_recovery_stateful": 63.0, "data_gap_recovery_stateful": 51.0, "data_gap_recovery_extended_stateful": 16.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 38.0, "inconsistent_api_recovery_stateful": 59.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 45, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 19}, "scenarioSpeedSum": 
{"relevance_detection": 108.68, "argument_fidelity": 227.79, "tool_selection": 240.06, "basic_2step": 87.06, "sequential_3step": 258.19, "conditional_routing": 659.61, "sequential_reasoning": 412.54, "error_recovery": 374.2, "data_gap_recovery": 1079.5, "data_gap_recovery_extended": 1121.63, "argument_transformation": 1156.46, "grounded_synthesis": 1014.91, "inconsistent_api_recovery": 1181.83, "relevance_detection_stateful": 117.54, "argument_fidelity_stateful": 243.21, "tool_selection_stateful": 242.61, "basic_2step_stateful": 92.22, "sequential_3step_stateful": 237.57, "conditional_routing_stateful": 684.94, "sequential_reasoning_stateful": 454.95, "error_recovery_stateful": 397.0, "data_gap_recovery_stateful": 1103.78, "data_gap_recovery_extended_stateful": 1175.07, "argument_transformation_stateful": 1038.75, "grounded_synthesis_stateful": 1185.51, "inconsistent_api_recovery_stateful": 827.99}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 45, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 19}}, {"label": "Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 
75.6, "accuracy": 83.8, "completeness": 90.2, "efficiency": 79.0, "wasted": 1.3, "speed": 3.0, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 22, "argument_transformation": 12, "grounded_synthesis": 56, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 28, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 11, "argument_transformation": 6, "grounded_synthesis": 28, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 37, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 37, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 88, "argument_transformation": 30, "grounded_synthesis": 280, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 112, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 250, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 146, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 344, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 67, "argument_transformation": 27, "grounded_synthesis": 378, "inconsistent_api_recovery": 650, "relevance_detection_stateful": 144, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 335, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 90, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 343, "inconsistent_api_recovery_stateful": 649}, "scenarioWastedSum": {"relevance_detection": 97.0, "argument_fidelity": 0.0, 
"tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 144.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 37.0, "grounded_synthesis": 160.0, "inconsistent_api_recovery": 250.0, "relevance_detection_stateful": 94.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 135.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 25.0, "grounded_synthesis_stateful": 189.0, "inconsistent_api_recovery_stateful": 249.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 37, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 34.24, "argument_fidelity": 38.12, "tool_selection": 0.0, "basic_2step": 19.87, "sequential_3step": 90.61, "conditional_routing": 114.26, "sequential_reasoning": 188.49, "error_recovery": 23.47, "data_gap_recovery": 121.57, "data_gap_recovery_extended": 324.6, "argument_transformation": 311.26, "grounded_synthesis": 234.77, 
"inconsistent_api_recovery": 222.33, "relevance_detection_stateful": 32.76, "argument_fidelity_stateful": 37.51, "tool_selection_stateful": 0.0, "basic_2step_stateful": 21.47, "sequential_3step_stateful": 92.57, "conditional_routing_stateful": 111.14, "sequential_reasoning_stateful": 177.95, "error_recovery_stateful": 23.38, "data_gap_recovery_stateful": 119.95, "data_gap_recovery_extended_stateful": 343.9, "argument_transformation_stateful": 327.52, "grounded_synthesis_stateful": 269.65, "inconsistent_api_recovery_stateful": 222.55}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 37, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [reforged]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 74.7, "accuracy": 74.7, "completeness": 100.0, "efficiency": 84.6, "wasted": 0.6, "speed": 12.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 70, "sequential_reasoning": 100, "error_recovery": 90, "data_gap_recovery": 88, "data_gap_recovery_extended": 0, "argument_transformation": 16, 
"grounded_synthesis": 34, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 90}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 35, "sequential_reasoning": 50, "error_recovery": 45, "data_gap_recovery": 44, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 17, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, 
"data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 45}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, 
"conditional_routing": 140, "sequential_reasoning": 200, "error_recovery": 90, "data_gap_recovery": 220, "data_gap_recovery_extended": 0, "argument_transformation": 40, "grounded_synthesis": 170, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 360}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 197, "sequential_reasoning": 200, "error_recovery": 142, "data_gap_recovery": 251, "data_gap_recovery_extended": 0, "argument_transformation": 33, "grounded_synthesis": 190, "inconsistent_api_recovery": 574, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 143, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 241, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 174, "inconsistent_api_recovery_stateful": 558}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 63.0, "sequential_reasoning": 0.0, "error_recovery": 58.0, "data_gap_recovery": 36.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 28.0, "inconsistent_api_recovery": 209.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, 
"tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 54.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 3.0, "data_gap_recovery_stateful": 39.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 45.0, "inconsistent_api_recovery_stateful": 217.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 55.27, "argument_fidelity": 158.21, "tool_selection": 154.48, "basic_2step": 78.95, "sequential_3step": 151.71, "conditional_routing": 718.02, "sequential_reasoning": 280.43, "error_recovery": 485.44, "data_gap_recovery": 788.85, "data_gap_recovery_extended": 841.04, "argument_transformation": 1439.71, "grounded_synthesis": 1162.08, "inconsistent_api_recovery": 2036.7, "relevance_detection_stateful": 50.18, "argument_fidelity_stateful": 153.37, "tool_selection_stateful": 149.35, "basic_2step_stateful": 75.02, "sequential_3step_stateful": 146.65, "conditional_routing_stateful": 688.24, "sequential_reasoning_stateful": 277.67, "error_recovery_stateful": 404.68, "data_gap_recovery_stateful": 797.17, 
"data_gap_recovery_extended_stateful": 824.11, "argument_transformation_stateful": 1401.31, "grounded_synthesis_stateful": 1256.62, "inconsistent_api_recovery_stateful": 1978.96}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma4:e4b-it-q4_K_M OL/N [reforged]", "model": "gemma4:e4b-it-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 74.8, "accuracy": 75.0, "completeness": 99.8, "efficiency": 82.6, "wasted": 0.8, "speed": 11.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 92, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 44, "inconsistent_api_recovery": 66, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 
78, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 46, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 21}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 230, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 220, "inconsistent_api_recovery": 264, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, 
"conditional_routing_stateful": 180, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 195, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 200, "inconsistent_api_recovery_stateful": 168}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 155, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 237, "sequential_reasoning": 200, "error_recovery": 157, "data_gap_recovery": 269, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 321, "inconsistent_api_recovery": 446, "relevance_detection_stateful": 52, "argument_fidelity_stateful": 150, "tool_selection_stateful": 157, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 227, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 153, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 283, "inconsistent_api_recovery_stateful": 289}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 0.0, "tool_selection": 5.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 49.0, "sequential_reasoning": 0.0, "error_recovery": 57.0, "data_gap_recovery": 45.0, "data_gap_recovery_extended": 16.0, "argument_transformation": 0.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 253.0, "relevance_detection_stateful": 2.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 7.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 47.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 3.0, "data_gap_recovery_stateful": 67.0, "data_gap_recovery_extended_stateful": 3.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 139.0, 
"inconsistent_api_recovery_stateful": 246.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 107.13, "argument_fidelity": 106.86, "tool_selection": 121.34, "basic_2step": 76.55, "sequential_3step": 110.42, "conditional_routing": 469.76, "sequential_reasoning": 210.93, "error_recovery": 146.49, "data_gap_recovery": 699.04, "data_gap_recovery_extended": 788.71, "argument_transformation": 1014.45, "grounded_synthesis": 927.35, "inconsistent_api_recovery": 1715.81, "relevance_detection_stateful": 122.87, "argument_fidelity_stateful": 132.41, "tool_selection_stateful": 155.18, "basic_2step_stateful": 69.27, "sequential_3step_stateful": 137.81, "conditional_routing_stateful": 568.16, "sequential_reasoning_stateful": 245.26, "error_recovery_stateful": 179.63, "data_gap_recovery_stateful": 983.12, "data_gap_recovery_extended_stateful": 880.46, "argument_transformation_stateful": 1347.08, "grounded_synthesis_stateful": 1011.46, "inconsistent_api_recovery_stateful": 2345.1}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged]", "model": "ministral-3:14b-instruct-2512-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 74.8, "accuracy": 74.8, "completeness": 100.0, "efficiency": 81.3, "wasted": 1.0, "speed": 6.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 96, "data_gap_recovery_extended": 56, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 28, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 8}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 240, "data_gap_recovery_extended": 224, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 64}, 
"scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 252, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 319, "data_gap_recovery_extended": 312, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 417, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 264, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 336, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 111}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 0.0, "conditional_routing": 52.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 81.0, "data_gap_recovery_extended": 167.0, "argument_transformation": 120.0, "grounded_synthesis": 21.0, "inconsistent_api_recovery": 131.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 64.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 92.0, "data_gap_recovery_extended_stateful": 10.0, "argument_transformation_stateful": 115.0, "grounded_synthesis_stateful": 31.0, "inconsistent_api_recovery_stateful": 165.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.88, "argument_fidelity": 72.85, "tool_selection": 45.85, "basic_2step": 52.41, "sequential_3step": 97.11, "conditional_routing": 431.22, "sequential_reasoning": 197.04, "error_recovery": 35.26, "data_gap_recovery": 364.63, "data_gap_recovery_extended": 631.08, "argument_transformation": 591.71, "grounded_synthesis": 706.31, "inconsistent_api_recovery": 873.37, "relevance_detection_stateful": 22.94, "argument_fidelity_stateful": 72.97, "tool_selection_stateful": 45.95, "basic_2step_stateful": 41.57, "sequential_3step_stateful": 111.88, "conditional_routing_stateful": 455.42, "sequential_reasoning_stateful": 196.41, "error_recovery_stateful": 35.4, "data_gap_recovery_stateful": 238.11, "data_gap_recovery_extended_stateful": 411.71, "argument_transformation_stateful": 641.57, "grounded_synthesis_stateful": 727.5, "inconsistent_api_recovery_stateful": 921.98}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma4:e4b-it-q8_0 OL/N [reforged]", "model": "gemma4:e4b-it-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 73.6, "accuracy": 73.8, "completeness": 99.8, "efficiency": 85.3, "wasted": 0.8, "speed": 12.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 78, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 34, "inconsistent_api_recovery": 60, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 78, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 34}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 39, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 17, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 17}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 156, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 170, "inconsistent_api_recovery": 240, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 156, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 136}, "scenarioActualCalls": {"relevance_detection": 52, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 193, "sequential_reasoning": 196, "error_recovery": 157, "data_gap_recovery": 275, "data_gap_recovery_extended": 0, "argument_transformation": 19, "grounded_synthesis": 251, "inconsistent_api_recovery": 368, 
"relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 151, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 194, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 154, "data_gap_recovery_stateful": 259, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 282, "inconsistent_api_recovery_stateful": 212}, "scenarioWastedSum": {"relevance_detection": 2.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 37.0, "sequential_reasoning": 0.0, "error_recovery": 57.0, "data_gap_recovery": 33.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 163.0, "inconsistent_api_recovery": 204.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, "data_gap_recovery_stateful": 27.0, "data_gap_recovery_extended_stateful": 14.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 193.0, "inconsistent_api_recovery_stateful": 205.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 120.14, "argument_fidelity": 135.63, "tool_selection": 148.33, "basic_2step": 82.98, "sequential_3step": 149.89, "conditional_routing": 572.98, "sequential_reasoning": 241.63, "error_recovery": 211.98, "data_gap_recovery": 749.8, "data_gap_recovery_extended": 857.51, "argument_transformation": 1557.36, "grounded_synthesis": 1195.18, "inconsistent_api_recovery": 2237.69, "relevance_detection_stateful": 121.56, "argument_fidelity_stateful": 141.51, "tool_selection_stateful": 136.38, "basic_2step_stateful": 76.93, "sequential_3step_stateful": 155.35, "conditional_routing_stateful": 558.35, "sequential_reasoning_stateful": 239.51, "error_recovery_stateful": 197.59, "data_gap_recovery_stateful": 780.02, "data_gap_recovery_extended_stateful": 881.04, "argument_transformation_stateful": 1517.79, "grounded_synthesis_stateful": 1292.08, "inconsistent_api_recovery_stateful": 2178.44}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "claude-haiku-4-5-20251001 AN/N [bare+any]", "model": 
"claude-haiku-4-5-20251001", "backend": "anthropic", "mode": "native", "ablation": "bare", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 74.0, "accuracy": 80.2, "completeness": 92.3, "efficiency": 100.0, "wasted": 0.0, "speed": 5.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 92, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 82, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 46, "argument_transformation": 0, "grounded_synthesis": 11, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 368, "argument_transformation": 0, "grounded_synthesis": 110, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 328, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 130, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 153, "data_gap_recovery_extended": 186, "argument_transformation": 0, "grounded_synthesis": 43, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 156, "data_gap_recovery_extended_stateful": 170, 
"argument_transformation_stateful": 5, "grounded_synthesis_stateful": 59, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 14.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 8.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 69.02, "argument_fidelity": 125.55, "tool_selection": 127.99, "basic_2step": 114.15, "sequential_3step": 216.54, "conditional_routing": 232.84, 
"sequential_reasoning": 225.11, "error_recovery": 0.0, "data_gap_recovery": 246.26, "data_gap_recovery_extended": 386.39, "argument_transformation": 309.38, "grounded_synthesis": 591.31, "inconsistent_api_recovery": 322.3, "relevance_detection_stateful": 60.29, "argument_fidelity_stateful": 239.66, "tool_selection_stateful": 253.85, "basic_2step_stateful": 290.34, "sequential_3step_stateful": 238.67, "conditional_routing_stateful": 234.28, "sequential_reasoning_stateful": 299.16, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 341.03, "data_gap_recovery_extended_stateful": 397.48, "argument_transformation_stateful": 404.67, "grounded_synthesis_stateful": 495.6, "inconsistent_api_recovery_stateful": 355.92}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/P [bare]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 74.3, "accuracy": 81.0, "completeness": 91.8, "efficiency": 100.0, "wasted": 0.0, "speed": 24.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, 
"sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 92, "data_gap_recovery_extended": 38, "argument_transformation": 14, "grounded_synthesis": 84, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 88, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 19, "argument_transformation": 7, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 
50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 230, "data_gap_recovery_extended": 152, "argument_transformation": 35, "grounded_synthesis": 420, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 176, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 440, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 214, "data_gap_recovery_extended": 77, "argument_transformation": 30, "grounded_synthesis": 258, "inconsistent_api_recovery": 221, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 122, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 223, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 255, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 8.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, 
"argument_transformation": 1.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 6.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 5.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 6.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 205.2, "argument_fidelity": 492.05, "tool_selection": 339.76, "basic_2step": 254.0, "sequential_3step": 495.57, "conditional_routing": 1160.72, "sequential_reasoning": 678.59, "error_recovery": 0.0, "data_gap_recovery": 1346.02, "data_gap_recovery_extended": 1533.84, "argument_transformation": 2894.93, "grounded_synthesis": 2945.57, "inconsistent_api_recovery": 1920.33, "relevance_detection_stateful": 211.42, "argument_fidelity_stateful": 497.9, "tool_selection_stateful": 331.17, "basic_2step_stateful": 284.87, "sequential_3step_stateful": 472.13, 
"conditional_routing_stateful": 1167.08, "sequential_reasoning_stateful": 657.21, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1422.21, "data_gap_recovery_extended_stateful": 1584.39, "argument_transformation_stateful": 2718.37, "grounded_synthesis_stateful": 2977.57, "inconsistent_api_recovery_stateful": 1989.43}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/P [reforged]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 73.1, "accuracy": 73.2, "completeness": 99.8, "efficiency": 88.8, "wasted": 0.4, "speed": 28.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 58, "data_gap_recovery": 96, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 28, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 
96, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 64, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 58}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 29, "data_gap_recovery": 48, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 14, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 32, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 29}, "scenarioCompleted": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 58, "data_gap_recovery": 240, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 140, "inconsistent_api_recovery": 376, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 232}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 152, "basic_2step": 101, "sequential_3step": 153, "conditional_routing": 265, "sequential_reasoning": 200, "error_recovery": 88, "data_gap_recovery": 244, "data_gap_recovery_extended": 0, "argument_transformation": 16, "grounded_synthesis": 123, "inconsistent_api_recovery": 536, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 153, "basic_2step_stateful": 100, "sequential_3step_stateful": 146, "conditional_routing_stateful": 258, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 59, "inconsistent_api_recovery_stateful": 348}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 2.0, "basic_2step": 1.0, "sequential_3step": 3.0, "conditional_routing": 65.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 6.0, "inconsistent_api_recovery": 164.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 3.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 58.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 
11.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 148.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 81.37, "argument_fidelity": 298.31, "tool_selection": 362.06, "basic_2step": 275.65, "sequential_3step": 694.01, "conditional_routing": 1284.98, "sequential_reasoning": 483.33, "error_recovery": 757.85, "data_gap_recovery": 977.4, "data_gap_recovery_extended": 1527.06, "argument_transformation": 2714.51, "grounded_synthesis": 3722.01, "inconsistent_api_recovery": 5374.96, "relevance_detection_stateful": 75.08, "argument_fidelity_stateful": 295.87, "tool_selection_stateful": 361.19, "basic_2step_stateful": 312.17, "sequential_3step_stateful": 705.38, "conditional_routing_stateful": 1303.11, "sequential_reasoning_stateful": 460.93, "error_recovery_stateful": 780.64, "data_gap_recovery_stateful": 1035.62, "data_gap_recovery_extended_stateful": 1546.95, "argument_transformation_stateful": 2618.29, "grounded_synthesis_stateful": 3643.71, "inconsistent_api_recovery_stateful": 5112.16}, "scenarioSpeedN": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "phi-4-Q4_K_M LS/P [reforged]", "model": "phi-4-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "phi-4", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 72.9, "accuracy": 73.3, "completeness": 99.5, "efficiency": 84.5, "wasted": 0.9, "speed": 4.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 34, "sequential_reasoning": 56, "error_recovery": 96, "data_gap_recovery": 90, "data_gap_recovery_extended": 52, "argument_transformation": 24, "grounded_synthesis": 38, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 62, "error_recovery_stateful": 92, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 52, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 48}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 17, "sequential_reasoning": 28, "error_recovery": 48, "data_gap_recovery": 45, "data_gap_recovery_extended": 26, "argument_transformation": 12, "grounded_synthesis": 19, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 12, "sequential_reasoning_stateful": 31, "error_recovery_stateful": 46, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 26, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 24}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 68, "sequential_reasoning": 112, "error_recovery": 96, "data_gap_recovery": 225, "data_gap_recovery_extended": 208, "argument_transformation": 60, "grounded_synthesis": 190, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 124, "error_recovery_stateful": 138, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 208, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 300, "inconsistent_api_recovery_stateful": 192}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 90, "sequential_reasoning": 112, "error_recovery": 149, "data_gap_recovery": 290, "data_gap_recovery_extended": 175, "argument_transformation": 42, "grounded_synthesis": 315, "inconsistent_api_recovery": 406, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 102, "sequential_3step_stateful": 150, "conditional_routing_stateful": 61, "sequential_reasoning_stateful": 124, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 322, "data_gap_recovery_extended_stateful": 184, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 472, "inconsistent_api_recovery_stateful": 283}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 26.0, "sequential_reasoning": 0.0, "error_recovery": 57.0, "data_gap_recovery": 66.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 2.0, "grounded_synthesis": 324.0, "inconsistent_api_recovery": 141.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 2.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 27.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 9.0, "data_gap_recovery_stateful": 79.0, "data_gap_recovery_extended_stateful": 5.0, "argument_transformation_stateful": 6.0, "grounded_synthesis_stateful": 287.0, "inconsistent_api_recovery_stateful": 135.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 22.74, "argument_fidelity": 53.62, "tool_selection": 45.23, "basic_2step": 33.22, "sequential_3step": 49.07, "conditional_routing": 253.47, "sequential_reasoning": 77.3, "error_recovery": 49.66, "data_gap_recovery": 227.4, "data_gap_recovery_extended": 261.31, "argument_transformation": 612.17, "grounded_synthesis": 547.31, "inconsistent_api_recovery": 403.34, "relevance_detection_stateful": 22.59, "argument_fidelity_stateful": 54.87, "tool_selection_stateful": 45.37, "basic_2step_stateful": 43.12, "sequential_3step_stateful": 49.08, "conditional_routing_stateful": 240.79, "sequential_reasoning_stateful": 79.82, "error_recovery_stateful": 54.12, "data_gap_recovery_stateful": 252.92, "data_gap_recovery_extended_stateful": 289.38, "argument_transformation_stateful": 583.0, "grounded_synthesis_stateful": 606.48, "inconsistent_api_recovery_stateful": 408.57}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 
50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [reforged]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 72.8, "accuracy": 72.8, "completeness": 100.0, "efficiency": 85.4, "wasted": 0.6, "speed": 8.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 38, "sequential_reasoning": 98, "error_recovery": 94, "data_gap_recovery": 94, "data_gap_recovery_extended": 0, "argument_transformation": 18, "grounded_synthesis": 26, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 90}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 19, "sequential_reasoning": 49, "error_recovery": 47, "data_gap_recovery": 47, "data_gap_recovery_extended": 0, "argument_transformation": 9, "grounded_synthesis": 13, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 45}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 76, "sequential_reasoning": 196, "error_recovery": 94, "data_gap_recovery": 235, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 130, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 360}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 89, "sequential_reasoning": 196, "error_recovery": 144, "data_gap_recovery": 253, "data_gap_recovery_extended": 0, 
"argument_transformation": 37, "grounded_synthesis": 192, "inconsistent_api_recovery": 584, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 151, "data_gap_recovery_stateful": 242, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 141, "inconsistent_api_recovery_stateful": 557}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 15.0, "sequential_reasoning": 0.0, "error_recovery": 54.0, "data_gap_recovery": 25.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 119.0, "inconsistent_api_recovery": 207.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 18.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 19.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 94.0, "inconsistent_api_recovery_stateful": 214.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 30.13, "argument_fidelity": 89.2, "tool_selection": 104.91, "basic_2step": 47.13, "sequential_3step": 101.99, "conditional_routing": 452.37, "sequential_reasoning": 181.69, "error_recovery": 242.23, "data_gap_recovery": 502.2, "data_gap_recovery_extended": 456.57, "argument_transformation": 1118.26, "grounded_synthesis": 866.14, "inconsistent_api_recovery": 1348.14, "relevance_detection_stateful": 29.41, "argument_fidelity_stateful": 87.92, "tool_selection_stateful": 107.48, "basic_2step_stateful": 45.02, "sequential_3step_stateful": 101.02, "conditional_routing_stateful": 454.27, "sequential_reasoning_stateful": 169.33, "error_recovery_stateful": 192.52, "data_gap_recovery_stateful": 494.41, "data_gap_recovery_extended_stateful": 438.28, "argument_transformation_stateful": 1173.02, "grounded_synthesis_stateful": 858.38, "inconsistent_api_recovery_stateful": 1309.65}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged]", "model": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "mistral-small-3.2", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 71.0, "accuracy": 71.2, "completeness": 99.8, "efficiency": 95.6, "wasted": 0.5, "speed": 6.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 58, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 68, "data_gap_recovery_extended": 4, "argument_transformation": 12, "grounded_synthesis": 20, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 
50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 29, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 34, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 10, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 
49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 116, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 170, "data_gap_recovery_extended": 16, "argument_transformation": 30, "grounded_synthesis": 100, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 116, "sequential_reasoning": 257, "error_recovery": 153, "data_gap_recovery": 273, "data_gap_recovery_extended": 7, "argument_transformation": 21, "grounded_synthesis": 62, "inconsistent_api_recovery": 315, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 99, 
"sequential_reasoning_stateful": 252, "error_recovery_stateful": 152, "data_gap_recovery_stateful": 21, "data_gap_recovery_extended_stateful": 62, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 51, "inconsistent_api_recovery_stateful": 352}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 57.0, "error_recovery": 53.0, "data_gap_recovery": 141.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 5.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 52.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 187.0, "data_gap_recovery_extended_stateful": 7.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 
55.24, "argument_fidelity": 88.3, "tool_selection": 57.73, "basic_2step": 40.14, "sequential_3step": 85.65, "conditional_routing": 381.28, "sequential_reasoning": 377.74, "error_recovery": 201.02, "data_gap_recovery": 673.06, "data_gap_recovery_extended": 487.57, "argument_transformation": 724.09, "grounded_synthesis": 765.34, "inconsistent_api_recovery": 484.16, "relevance_detection_stateful": 60.08, "argument_fidelity_stateful": 88.18, "tool_selection_stateful": 57.76, "basic_2step_stateful": 49.02, "sequential_3step_stateful": 85.15, "conditional_routing_stateful": 356.47, "sequential_reasoning_stateful": 382.39, "error_recovery_stateful": 154.74, "data_gap_recovery_stateful": 374.2, "data_gap_recovery_extended_stateful": 546.5, "argument_transformation_stateful": 681.13, "grounded_synthesis_stateful": 734.01, "inconsistent_api_recovery_stateful": 471.27}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/P [reforged]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 70.5, "accuracy": 70.8, "completeness": 99.7, "efficiency": 85.9, "wasted": 0.5, 
"speed": 24.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 64, "data_gap_recovery": 68, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 32, "inconsistent_api_recovery": 72, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 58, "data_gap_recovery_stateful": 66, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 58}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 32, "data_gap_recovery": 34, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 16, 
"inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 29, "data_gap_recovery_stateful": 33, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 29}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 64, "data_gap_recovery": 170, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 160, "inconsistent_api_recovery": 288, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 165, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 232}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 189, "basic_2step": 105, "sequential_3step": 150, "conditional_routing": 255, "sequential_reasoning": 200, "error_recovery": 96, "data_gap_recovery": 160, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 226, "inconsistent_api_recovery": 349, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 194, "basic_2step_stateful": 107, "sequential_3step_stateful": 148, "conditional_routing_stateful": 259, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 157, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 221, "inconsistent_api_recovery_stateful": 327}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 39.0, "basic_2step": 5.0, "sequential_3step": 0.0, 
"conditional_routing": 69.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 99.0, "inconsistent_api_recovery": 71.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 44.0, "basic_2step_stateful": 7.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 73.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 125.0, "inconsistent_api_recovery_stateful": 109.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 77.59, "argument_fidelity": 307.97, "tool_selection": 504.16, "basic_2step": 153.55, "sequential_3step": 437.42, "conditional_routing": 1270.84, "sequential_reasoning": 466.6, "error_recovery": 457.4, "data_gap_recovery": 836.3, "data_gap_recovery_extended": 1276.0, "argument_transformation": 2315.56, "grounded_synthesis": 4227.72, "inconsistent_api_recovery": 3054.18, "relevance_detection_stateful": 77.9, 
"argument_fidelity_stateful": 305.89, "tool_selection_stateful": 499.61, "basic_2step_stateful": 172.79, "sequential_3step_stateful": 487.99, "conditional_routing_stateful": 1291.44, "sequential_reasoning_stateful": 469.35, "error_recovery_stateful": 495.33, "data_gap_recovery_stateful": 855.05, "data_gap_recovery_extended_stateful": 1324.68, "argument_transformation_stateful": 2459.77, "grounded_synthesis_stateful": 4134.75, "inconsistent_api_recovery_stateful": 3408.49}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}}, {"label": "ministral-3:8b-instruct-2512-q8_0 OL/N [reforged]", "model": "ministral-3:8b-instruct-2512-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 70.7, "accuracy": 74.5, "completeness": 94.9, "efficiency": 73.6, "wasted": 1.1, "speed": 5.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 92, "data_gap_recovery_extended": 12, "argument_transformation": 42, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 42, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 12}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 46, "data_gap_recovery_extended": 6, "argument_transformation": 21, "grounded_synthesis": 0, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 13, 
"data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 15}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 15}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, 
"sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 230, "data_gap_recovery_extended": 48, "argument_transformation": 105, "grounded_synthesis": 0, "inconsistent_api_recovery": 168, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 65, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 48}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 200, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 164, "conditional_routing": 260, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 347, "data_gap_recovery_extended": 50, "argument_transformation": 142, "grounded_synthesis": 0, "inconsistent_api_recovery": 296, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 200, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 158, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 114, "data_gap_recovery_extended_stateful": 17, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 87}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 50.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 14.0, "conditional_routing": 60.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 129.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 98.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 128.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 50.0, 
"tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 8.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 134.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 84.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 82.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 15}, "scenarioSpeedSum": {"relevance_detection": 51.13, "argument_fidelity": 111.05, "tool_selection": 87.09, "basic_2step": 53.1, "sequential_3step": 74.19, "conditional_routing": 224.51, "sequential_reasoning": 259.91, "error_recovery": 115.73, "data_gap_recovery": 657.06, "data_gap_recovery_extended": 498.6, "argument_transformation": 727.93, "grounded_synthesis": 566.4, "inconsistent_api_recovery": 386.35, "relevance_detection_stateful": 51.32, "argument_fidelity_stateful": 110.79, "tool_selection_stateful": 86.38, "basic_2step_stateful": 47.76, "sequential_3step_stateful": 62.47, "conditional_routing_stateful": 196.98, "sequential_reasoning_stateful": 259.06, "error_recovery_stateful": 89.73, "data_gap_recovery_stateful": 533.38, 
"data_gap_recovery_extended_stateful": 489.69, "argument_transformation_stateful": 681.24, "grounded_synthesis_stateful": 579.42, "inconsistent_api_recovery_stateful": 269.45}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 15}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged]", "model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 71.3, "accuracy": 81.0, "completeness": 88.0, "efficiency": 72.3, "wasted": 1.5, "speed": 21.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 66, "sequential_reasoning": 98, "error_recovery": 52, "data_gap_recovery": 92, "data_gap_recovery_extended": 28, "argument_transformation": 4, "grounded_synthesis": 34, "inconsistent_api_recovery": 34, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 98, "sequential_3step_stateful": 100, "conditional_routing_stateful": 86, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 68, 
"data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 38}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 33, "sequential_reasoning": 49, "error_recovery": 26, "data_gap_recovery": 46, "data_gap_recovery_extended": 14, "argument_transformation": 2, "grounded_synthesis": 17, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 34, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 19}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 
50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, "grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, "grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 132, "sequential_reasoning": 196, "error_recovery": 52, "data_gap_recovery": 230, "data_gap_recovery_extended": 112, "argument_transformation": 10, "grounded_synthesis": 170, "inconsistent_api_recovery": 136, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, 
"sequential_3step_stateful": 150, "conditional_routing_stateful": 172, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 102, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 96, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 152}, "scenarioActualCalls": {"relevance_detection": 92, "argument_fidelity": 151, "tool_selection": 160, "basic_2step": 127, "sequential_3step": 157, "conditional_routing": 174, "sequential_reasoning": 302, "error_recovery": 84, "data_gap_recovery": 325, "data_gap_recovery_extended": 126, "argument_transformation": 22, "grounded_synthesis": 276, "inconsistent_api_recovery": 277, "relevance_detection_stateful": 85, "argument_fidelity_stateful": 150, "tool_selection_stateful": 155, "basic_2step_stateful": 124, "sequential_3step_stateful": 153, "conditional_routing_stateful": 233, "sequential_reasoning_stateful": 283, "error_recovery_stateful": 113, "data_gap_recovery_stateful": 345, "data_gap_recovery_extended_stateful": 117, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 292, "inconsistent_api_recovery_stateful": 300}, "scenarioWastedSum": {"relevance_detection": 42.0, "argument_fidelity": 1.0, "tool_selection": 10.0, "basic_2step": 27.0, "sequential_3step": 7.0, "conditional_routing": 47.0, "sequential_reasoning": 106.0, "error_recovery": 44.0, "data_gap_recovery": 95.0, "data_gap_recovery_extended": 79.0, "argument_transformation": 100.0, "grounded_synthesis": 196.0, "inconsistent_api_recovery": 141.0, "relevance_detection_stateful": 35.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 5.0, "basic_2step_stateful": 26.0, "sequential_3step_stateful": 3.0, "conditional_routing_stateful": 61.0, "sequential_reasoning_stateful": 102.0, "error_recovery_stateful": 16.0, "data_gap_recovery_stateful": 100.0, "data_gap_recovery_extended_stateful": 76.0, "argument_transformation_stateful": 106.0, 
"grounded_synthesis_stateful": 190.0, "inconsistent_api_recovery_stateful": 155.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, "grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}, "scenarioSpeedSum": {"relevance_detection": 126.82, "argument_fidelity": 178.91, "tool_selection": 228.93, "basic_2step": 245.29, "sequential_3step": 244.92, "conditional_routing": 560.68, "sequential_reasoning": 1373.58, "error_recovery": 422.83, "data_gap_recovery": 1448.98, "data_gap_recovery_extended": 1628.44, "argument_transformation": 2589.83, "grounded_synthesis": 1477.52, "inconsistent_api_recovery": 1578.41, "relevance_detection_stateful": 124.08, "argument_fidelity_stateful": 181.64, "tool_selection_stateful": 207.7, "basic_2step_stateful": 191.81, "sequential_3step_stateful": 221.1, "conditional_routing_stateful": 589.65, "sequential_reasoning_stateful": 1295.08, "error_recovery_stateful": 568.14, "data_gap_recovery_stateful": 1337.71, "data_gap_recovery_extended_stateful": 1842.25, "argument_transformation_stateful": 2618.18, "grounded_synthesis_stateful": 1327.51, "inconsistent_api_recovery_stateful": 1893.77}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, "grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}}, {"label": "Qwen3-8B-Q8_0 LS/N [reforged]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 70.3, "accuracy": 70.5, "completeness": 99.7, "efficiency": 88.2, "wasted": 0.6, "speed": 24.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 60, "data_gap_recovery": 82, "data_gap_recovery_extended": 4, "argument_transformation": 22, "grounded_synthesis": 20, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 58, "data_gap_recovery_stateful": 66, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 50}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 30, "data_gap_recovery": 41, "data_gap_recovery_extended": 2, "argument_transformation": 11, "grounded_synthesis": 10, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 29, "data_gap_recovery_stateful": 33, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 25}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 60, "data_gap_recovery": 205, "data_gap_recovery_extended": 16, "argument_transformation": 55, "grounded_synthesis": 100, "inconsistent_api_recovery": 128, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 165, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 
200}, "scenarioActualCalls": {"relevance_detection": 57, "argument_fidelity": 169, "tool_selection": 204, "basic_2step": 108, "sequential_3step": 155, "conditional_routing": 253, "sequential_reasoning": 210, "error_recovery": 106, "data_gap_recovery": 196, "data_gap_recovery_extended": 8, "argument_transformation": 61, "grounded_synthesis": 73, "inconsistent_api_recovery": 170, "relevance_detection_stateful": 62, "argument_fidelity_stateful": 170, "tool_selection_stateful": 201, "basic_2step_stateful": 106, "sequential_3step_stateful": 156, "conditional_routing_stateful": 238, "sequential_reasoning_stateful": 213, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 108, "inconsistent_api_recovery_stateful": 275}, "scenarioWastedSum": {"relevance_detection": 7.0, "argument_fidelity": 19.0, "tool_selection": 54.0, "basic_2step": 8.0, "sequential_3step": 5.0, "conditional_routing": 55.0, "sequential_reasoning": 10.0, "error_recovery": 84.0, "data_gap_recovery": 17.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 66.0, "grounded_synthesis": 19.0, "inconsistent_api_recovery": 58.0, "relevance_detection_stateful": 12.0, "argument_fidelity_stateful": 20.0, "tool_selection_stateful": 51.0, "basic_2step_stateful": 6.0, "sequential_3step_stateful": 10.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 13.0, "error_recovery_stateful": 31.0, "data_gap_recovery_stateful": 16.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 69.0, "grounded_synthesis_stateful": 24.0, "inconsistent_api_recovery_stateful": 85.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 181.83, "argument_fidelity": 528.43, "tool_selection": 484.01, "basic_2step": 262.34, "sequential_3step": 756.69, "conditional_routing": 1308.45, "sequential_reasoning": 761.26, "error_recovery": 890.92, "data_gap_recovery": 1202.58, "data_gap_recovery_extended": 1303.64, "argument_transformation": 2860.86, "grounded_synthesis": 2195.43, "inconsistent_api_recovery": 2614.79, "relevance_detection_stateful": 207.61, "argument_fidelity_stateful": 550.24, "tool_selection_stateful": 493.23, "basic_2step_stateful": 296.19, "sequential_3step_stateful": 844.22, "conditional_routing_stateful": 1371.02, "sequential_reasoning_stateful": 828.73, "error_recovery_stateful": 748.2, "data_gap_recovery_stateful": 1055.67, "data_gap_recovery_extended_stateful": 1347.76, "argument_transformation_stateful": 2948.29, "grounded_synthesis_stateful": 2244.97, "inconsistent_api_recovery_stateful": 2953.96}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/P [reforged]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 70.4, "accuracy": 70.7, "completeness": 99.6, "efficiency": 85.8, "wasted": 0.5, "speed": 17.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 56, "data_gap_recovery": 64, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 12, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 62, "data_gap_recovery_stateful": 58, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 78}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 28, "data_gap_recovery": 32, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 6, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 31, "data_gap_recovery_stateful": 29, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 39}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 56, "data_gap_recovery": 160, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 60, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 93, "data_gap_recovery_stateful": 145, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 312}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 178, "basic_2step": 104, "sequential_3step": 150, "conditional_routing": 242, "sequential_reasoning": 201, "error_recovery": 84, "data_gap_recovery": 156, "data_gap_recovery_extended": 0, "argument_transformation": 27, "grounded_synthesis": 53, 
"inconsistent_api_recovery": 562, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 170, "basic_2step_stateful": 102, "sequential_3step_stateful": 150, "conditional_routing_stateful": 249, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 93, "data_gap_recovery_stateful": 141, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 483}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 28.0, "basic_2step": 4.0, "sequential_3step": 0.0, "conditional_routing": 54.0, "sequential_reasoning": 1.0, "error_recovery": 50.0, "data_gap_recovery": 6.0, "data_gap_recovery_extended": 8.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 194.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 20.0, "basic_2step_stateful": 2.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 61.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 4.0, "data_gap_recovery_extended_stateful": 8.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 182.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 78.3, "argument_fidelity": 194.48, "tool_selection": 375.63, "basic_2step": 179.33, "sequential_3step": 434.59, "conditional_routing": 922.14, "sequential_reasoning": 354.42, "error_recovery": 425.06, "data_gap_recovery": 730.83, "data_gap_recovery_extended": 1115.32, "argument_transformation": 1691.8, "grounded_synthesis": 1902.04, "inconsistent_api_recovery": 3383.33, "relevance_detection_stateful": 73.86, "argument_fidelity_stateful": 193.67, "tool_selection_stateful": 298.83, "basic_2step_stateful": 252.38, "sequential_3step_stateful": 425.91, "conditional_routing_stateful": 953.34, "sequential_reasoning_stateful": 355.37, "error_recovery_stateful": 434.35, "data_gap_recovery_stateful": 714.73, "data_gap_recovery_extended_stateful": 1119.07, "argument_transformation_stateful": 1590.39, "grounded_synthesis_stateful": 1585.6, "inconsistent_api_recovery_stateful": 3287.29}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P 
[reforged]", "model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 70.2, "accuracy": 70.7, "completeness": 99.4, "efficiency": 88.7, "wasted": 0.4, "speed": 10.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 52, "sequential_reasoning": 100, "error_recovery": 84, "data_gap_recovery": 90, "data_gap_recovery_extended": 6, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 80, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 66}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 26, "sequential_reasoning": 50, "error_recovery": 42, "data_gap_recovery": 45, "data_gap_recovery_extended": 3, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 33}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 104, "sequential_reasoning": 200, "error_recovery": 84, "data_gap_recovery": 225, "data_gap_recovery_extended": 24, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 264}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 101, "sequential_3step": 149, "conditional_routing": 141, "sequential_reasoning": 200, "error_recovery": 126, "data_gap_recovery": 225, "data_gap_recovery_extended": 23, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 566, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 151, "basic_2step_stateful": 108, "sequential_3step_stateful": 155, "conditional_routing_stateful": 117, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 230, 
"data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 379}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 1.0, "sequential_3step": 2.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 166.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 8.0, "sequential_3step_stateful": 5.0, "conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 7.0, "inconsistent_api_recovery_stateful": 152.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 70.04, "argument_fidelity": 132.41, "tool_selection": 117.18, "basic_2step": 269.63, "sequential_3step": 
199.88, "conditional_routing": 470.04, "sequential_reasoning": 278.36, "error_recovery": 270.9, "data_gap_recovery": 588.34, "data_gap_recovery_extended": 910.47, "argument_transformation": 886.85, "grounded_synthesis": 650.18, "inconsistent_api_recovery": 2156.16, "relevance_detection_stateful": 68.15, "argument_fidelity_stateful": 132.9, "tool_selection_stateful": 119.51, "basic_2step_stateful": 365.62, "sequential_3step_stateful": 210.28, "conditional_routing_stateful": 437.49, "sequential_reasoning_stateful": 306.75, "error_recovery_stateful": 290.9, "data_gap_recovery_stateful": 630.65, "data_gap_recovery_extended_stateful": 742.67, "argument_transformation_stateful": 830.77, "grounded_synthesis_stateful": 658.65, "inconsistent_api_recovery_stateful": 2145.56}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite4.1:8b-q8_0 OL/N [reforged]", "model": "granite4.1:8b-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 69.2, "accuracy": 69.2, "completeness": 100.0, "efficiency": 83.3, "wasted": 1.1, "speed": 2.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, 
"tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 250, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 250, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 150.0, 
"data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 108.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 100.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 108.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 54.2, "argument_fidelity": 73.33, "tool_selection": 87.19, "basic_2step": 47.03, "sequential_3step": 68.13, "conditional_routing": 160.63, "sequential_reasoning": 80.65, "error_recovery": 157.08, "data_gap_recovery": 219.66, "data_gap_recovery_extended": 206.67, "argument_transformation": 181.51, "grounded_synthesis": 311.91, "inconsistent_api_recovery": 264.9, "relevance_detection_stateful": 54.2, "argument_fidelity_stateful": 73.21, "tool_selection_stateful": 87.06, 
"basic_2step_stateful": 43.04, "sequential_3step_stateful": 68.12, "conditional_routing_stateful": 168.26, "sequential_reasoning_stateful": 80.49, "error_recovery_stateful": 157.14, "data_gap_recovery_stateful": 219.53, "data_gap_recovery_extended_stateful": 206.58, "argument_transformation_stateful": 181.39, "grounded_synthesis_stateful": 311.75, "inconsistent_api_recovery_stateful": 264.85}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-27B-Q4_K_M LS/P [bare]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 69.0, "accuracy": 75.5, "completeness": 91.4, "efficiency": 100.0, "wasted": 0.2, "speed": 50.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 54, "grounded_synthesis": 46, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, 
"tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 92, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 70, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 27, "grounded_synthesis": 23, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 17, 
"inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 
135, "grounded_synthesis": 230, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 138, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 175, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 179, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 244, "data_gap_recovery_extended": 0, "argument_transformation": 109, "grounded_synthesis": 263, "inconsistent_api_recovery": 257, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 140, "conditional_routing_stateful": 9, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 237, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 145, "grounded_synthesis_stateful": 185, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 1.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 91.0, "inconsistent_api_recovery": 15.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 0.0, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 3.0, "grounded_synthesis_stateful": 73.0, "inconsistent_api_recovery_stateful": 16.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 460.81, "argument_fidelity": 557.39, "tool_selection": 426.26, "basic_2step": 516.92, "sequential_3step": 805.49, "conditional_routing": 1188.14, "sequential_reasoning": 946.99, "error_recovery": 0.0, "data_gap_recovery": 2437.53, "data_gap_recovery_extended": 3008.39, "argument_transformation": 5834.95, "grounded_synthesis": 5097.73, "inconsistent_api_recovery": 7928.95, "relevance_detection_stateful": 343.5, "argument_fidelity_stateful": 545.73, "tool_selection_stateful": 435.01, "basic_2step_stateful": 849.63, "sequential_3step_stateful": 1131.85, "conditional_routing_stateful": 1242.6, "sequential_reasoning_stateful": 893.73, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2500.74, "data_gap_recovery_extended_stateful": 3252.47, "argument_transformation_stateful": 5730.32, "grounded_synthesis_stateful": 5259.62, "inconsistent_api_recovery_stateful": 8617.78}, 
"scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/N [reforged]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 68.2, "accuracy": 68.4, "completeness": 99.6, "efficiency": 85.7, "wasted": 0.7, "speed": 16.1, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 92, "sequential_reasoning": 100, "error_recovery": 48, "data_gap_recovery": 78, "data_gap_recovery_extended": 0, "argument_transformation": 44, "grounded_synthesis": 8, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 38}, "scenarioRuns": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 46, "sequential_reasoning": 50, "error_recovery": 24, "data_gap_recovery": 39, "data_gap_recovery_extended": 0, "argument_transformation": 22, "grounded_synthesis": 4, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 20, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 7, "inconsistent_api_recovery_stateful": 19}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 184, "sequential_reasoning": 200, "error_recovery": 48, "data_gap_recovery": 195, "data_gap_recovery_extended": 0, "argument_transformation": 110, "grounded_synthesis": 40, "inconsistent_api_recovery": 152, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 60, "data_gap_recovery_stateful": 190, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 70, "inconsistent_api_recovery_stateful": 152}, "scenarioActualCalls": {"relevance_detection": 57, "argument_fidelity": 167, "tool_selection": 205, "basic_2step": 109, "sequential_3step": 168, "conditional_routing": 231, "sequential_reasoning": 213, "error_recovery": 81, "data_gap_recovery": 150, "data_gap_recovery_extended": 0, "argument_transformation": 120, "grounded_synthesis": 17, "inconsistent_api_recovery": 247, "relevance_detection_stateful": 53, "argument_fidelity_stateful": 172, "tool_selection_stateful": 204, "basic_2step_stateful": 113, "sequential_3step_stateful": 170, "conditional_routing_stateful": 231, "sequential_reasoning_stateful": 217, "error_recovery_stateful": 69, "data_gap_recovery_stateful": 159, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 63, "inconsistent_api_recovery_stateful": 246}, "scenarioWastedSum": {"relevance_detection": 8.0, "argument_fidelity": 17.0, "tool_selection": 55.0, "basic_2step": 9.0, "sequential_3step": 18.0, "conditional_routing": 47.0, "sequential_reasoning": 13.0, "error_recovery": 84.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 12.0, "argument_transformation": 55.0, "grounded_synthesis": 7.0, "inconsistent_api_recovery": 96.0, "relevance_detection_stateful": 3.0, "argument_fidelity_stateful": 22.0, "tool_selection_stateful": 54.0, "basic_2step_stateful": 13.0, "sequential_3step_stateful": 20.0, "conditional_routing_stateful": 52.0, "sequential_reasoning_stateful": 17.0, "error_recovery_stateful": 38.0, "data_gap_recovery_stateful": 12.0, "data_gap_recovery_extended_stateful": 8.0, "argument_transformation_stateful": 71.0, "grounded_synthesis_stateful": 32.0, "inconsistent_api_recovery_stateful": 99.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 64.12, "argument_fidelity": 369.32, "tool_selection": 339.75, "basic_2step": 150.68, "sequential_3step": 489.89, "conditional_routing": 942.85, "sequential_reasoning": 598.23, "error_recovery": 517.12, "data_gap_recovery": 684.51, "data_gap_recovery_extended": 967.69, "argument_transformation": 1904.48, "grounded_synthesis": 1422.26, "inconsistent_api_recovery": 1900.31, "relevance_detection_stateful": 59.9, "argument_fidelity_stateful": 404.54, "tool_selection_stateful": 363.54, "basic_2step_stateful": 177.85, "sequential_3step_stateful": 525.48, "conditional_routing_stateful": 959.14, "sequential_reasoning_stateful": 627.65, "error_recovery_stateful": 565.93, "data_gap_recovery_stateful": 696.09, "data_gap_recovery_extended_stateful": 918.51, "argument_transformation_stateful": 1767.16, "grounded_synthesis_stateful": 1525.07, "inconsistent_api_recovery_stateful": 1893.18}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/N [reforged]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 67.7, "accuracy": 67.7, "completeness": 99.9, "efficiency": 85.1, "wasted": 0.9, "speed": 20.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 62, "data_gap_recovery": 36, "data_gap_recovery_extended": 4, "argument_transformation": 22, "grounded_synthesis": 44, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 66, "data_gap_recovery_stateful": 24, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 18, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 32}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 31, "data_gap_recovery": 18, "data_gap_recovery_extended": 2, "argument_transformation": 11, "grounded_synthesis": 22, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 33, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 16}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 62, "data_gap_recovery": 90, "data_gap_recovery_extended": 16, "argument_transformation": 55, "grounded_synthesis": 220, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 99, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 128}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 220, "tool_selection": 213, "basic_2step": 165, "sequential_3step": 185, "conditional_routing": 
196, "sequential_reasoning": 252, "error_recovery": 129, "data_gap_recovery": 69, "data_gap_recovery_extended": 6, "argument_transformation": 88, "grounded_synthesis": 103, "inconsistent_api_recovery": 114, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 222, "tool_selection_stateful": 215, "basic_2step_stateful": 161, "sequential_3step_stateful": 173, "conditional_routing_stateful": 187, "sequential_reasoning_stateful": 249, "error_recovery_stateful": 142, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 28, "argument_transformation_stateful": 52, "grounded_synthesis_stateful": 104, "inconsistent_api_recovery_stateful": 195}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 70.0, "tool_selection": 63.0, "basic_2step": 65.0, "sequential_3step": 35.0, "conditional_routing": 49.0, "sequential_reasoning": 52.0, "error_recovery": 117.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 105.0, "grounded_synthesis": 8.0, "inconsistent_api_recovery": 51.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 72.0, "tool_selection_stateful": 65.0, "basic_2step_stateful": 61.0, "sequential_3step_stateful": 27.0, "conditional_routing_stateful": 53.0, "sequential_reasoning_stateful": 49.0, "error_recovery_stateful": 67.0, "data_gap_recovery_stateful": 15.0, "data_gap_recovery_extended_stateful": 5.0, "argument_transformation_stateful": 82.0, "grounded_synthesis_stateful": 18.0, "inconsistent_api_recovery_stateful": 76.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 91.9, "argument_fidelity": 555.8, "tool_selection": 484.8, "basic_2step": 378.72, "sequential_3step": 748.88, "conditional_routing": 1017.18, "sequential_reasoning": 981.52, "error_recovery": 547.79, "data_gap_recovery": 802.15, "data_gap_recovery_extended": 1050.47, "argument_transformation": 2879.3, "grounded_synthesis": 1878.42, "inconsistent_api_recovery": 1975.55, "relevance_detection_stateful": 97.42, "argument_fidelity_stateful": 526.23, "tool_selection_stateful": 490.49, "basic_2step_stateful": 351.83, "sequential_3step_stateful": 665.69, "conditional_routing_stateful": 1084.18, "sequential_reasoning_stateful": 1006.83, "error_recovery_stateful": 548.28, "data_gap_recovery_stateful": 826.86, "data_gap_recovery_extended_stateful": 998.64, "argument_transformation_stateful": 2811.88, "grounded_synthesis_stateful": 1889.06, "inconsistent_api_recovery_stateful": 2273.76}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:8b-q8_0 OL/N [reforged]", "model": "qwen3:8b-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 67.5, "accuracy": 67.6, "completeness": 99.9, "efficiency": 85.1, "wasted": 0.6, "speed": 31.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 26, "data_gap_recovery": 88, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 6, "inconsistent_api_recovery": 66, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 82, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 44}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 
50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 13, "data_gap_recovery": 44, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 3, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 20, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 22}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 26, "data_gap_recovery": 220, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 30, "inconsistent_api_recovery": 264, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 60, "data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 176}, "scenarioActualCalls": {"relevance_detection": 56, "argument_fidelity": 150, "tool_selection": 202, "basic_2step": 133, "sequential_3step": 155, "conditional_routing": 252, "sequential_reasoning": 200, "error_recovery": 39, "data_gap_recovery": 229, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 26, "inconsistent_api_recovery": 379, "relevance_detection_stateful": 57, "argument_fidelity_stateful": 150, "tool_selection_stateful": 201, "basic_2step_stateful": 125, "sequential_3step_stateful": 154, 
"conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 62, "data_gap_recovery_stateful": 221, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 264}, "scenarioWastedSum": {"relevance_detection": 6.0, "argument_fidelity": 0.0, "tool_selection": 52.0, "basic_2step": 33.0, "sequential_3step": 5.0, "conditional_routing": 52.0, "sequential_reasoning": 0.0, "error_recovery": 88.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 3.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 170.0, "relevance_detection_stateful": 7.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 51.0, "basic_2step_stateful": 25.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 33.0, "data_gap_recovery_stateful": 24.0, "data_gap_recovery_extended_stateful": 3.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 3.0, "inconsistent_api_recovery_stateful": 183.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioSpeedSum": {"relevance_detection": 160.09, "argument_fidelity": 379.27, "tool_selection": 722.42, "basic_2step": 454.24, "sequential_3step": 674.39, "conditional_routing": 1498.29, "sequential_reasoning": 507.33, "error_recovery": 724.42, "data_gap_recovery": 1028.09, "data_gap_recovery_extended": 1690.29, "argument_transformation": 3008.72, "grounded_synthesis": 2852.96, "inconsistent_api_recovery": 6197.34, "relevance_detection_stateful": 161.17, "argument_fidelity_stateful": 420.92, "tool_selection_stateful": 749.62, "basic_2step_stateful": 491.27, "sequential_3step_stateful": 744.41, "conditional_routing_stateful": 1621.71, "sequential_reasoning_stateful": 586.99, "error_recovery_stateful": 714.69, "data_gap_recovery_stateful": 1199.1, "data_gap_recovery_extended_stateful": 1843.54, "argument_transformation_stateful": 3016.37, "grounded_synthesis_stateful": 2514.35, "inconsistent_api_recovery_stateful": 6354.36}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/P [reforged]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, 
"retired": true, "score": 68.2, "accuracy": 75.6, "completeness": 90.2, "efficiency": 93.9, "wasted": 1.0, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 100, "sequential_reasoning": 64, "error_recovery": 100, "data_gap_recovery": 90, "data_gap_recovery_extended": 80, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 60, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 70, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 32, "error_recovery": 50, 
"data_gap_recovery": 45, "data_gap_recovery_extended": 40, "argument_transformation": 0, "grounded_synthesis": 25, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 200, "sequential_reasoning": 128, "error_recovery": 100, "data_gap_recovery": 225, "data_gap_recovery_extended": 320, "argument_transformation": 0, "grounded_synthesis": 250, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 120, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 240, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 158, "tool_selection": 171, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 250, "sequential_reasoning": 351, "error_recovery": 150, "data_gap_recovery": 194, "data_gap_recovery_extended": 140, "argument_transformation": 0, "grounded_synthesis": 227, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 157, "tool_selection_stateful": 168, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 248, "sequential_reasoning_stateful": 320, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 187, "data_gap_recovery_extended_stateful": 128, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 238, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, 
"argument_fidelity": 8.0, "tool_selection": 21.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 241.0, "error_recovery": 50.0, "data_gap_recovery": 21.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 7.0, "grounded_synthesis": 112.0, "inconsistent_api_recovery": 97.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 18.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 48.0, "sequential_reasoning_stateful": 243.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 15.0, "grounded_synthesis_stateful": 113.0, "inconsistent_api_recovery_stateful": 86.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 18.23, "argument_fidelity": 48.16, "tool_selection": 65.1, "basic_2step": 30.51, "sequential_3step": 0.0, "conditional_routing": 128.95, "sequential_reasoning": 124.57, "error_recovery": 34.1, "data_gap_recovery": 155.55, "data_gap_recovery_extended": 202.16, "argument_transformation": 205.06, 
"grounded_synthesis": 491.91, "inconsistent_api_recovery": 385.08, "relevance_detection_stateful": 18.95, "argument_fidelity_stateful": 47.74, "tool_selection_stateful": 65.76, "basic_2step_stateful": 33.48, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 128.08, "sequential_reasoning_stateful": 127.05, "error_recovery_stateful": 34.09, "data_gap_recovery_stateful": 143.22, "data_gap_recovery_extended_stateful": 194.04, "argument_transformation_stateful": 213.1, "grounded_synthesis_stateful": 509.16, "inconsistent_api_recovery_stateful": 377.19}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [bare]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 67.8, "accuracy": 77.0, "completeness": 88.1, "efficiency": 100.0, "wasted": 0.2, "speed": 9.4, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 100, "tool_selection": 82, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 86, "error_recovery": 0, "data_gap_recovery": 94, "data_gap_recovery_extended": 0, 
"argument_transformation": 18, "grounded_synthesis": 80, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 96, "argument_fidelity_stateful": 100, "tool_selection_stateful": 86, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 88, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 82, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 43, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 0, "argument_transformation": 9, "grounded_synthesis": 40, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 45, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 45, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 123, "basic_2step": 100, "sequential_3step": 150, 
"conditional_routing": 192, "sequential_reasoning": 172, "error_recovery": 0, "data_gap_recovery": 235, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 400, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 150, "tool_selection_stateful": 129, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 176, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 410, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 154, "tool_selection": 127, "basic_2step": 100, "sequential_3step": 161, "conditional_routing": 199, "sequential_reasoning": 182, "error_recovery": 0, "data_gap_recovery": 231, "data_gap_recovery_extended": 0, "argument_transformation": 36, "grounded_synthesis": 169, "inconsistent_api_recovery": 272, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 156, "tool_selection_stateful": 135, "basic_2step_stateful": 102, "sequential_3step_stateful": 151, "conditional_routing_stateful": 191, "sequential_reasoning_stateful": 185, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 199, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 4.0, "tool_selection": 4.0, "basic_2step": 0.0, "sequential_3step": 11.0, "conditional_routing": 20.0, "sequential_reasoning": 10.0, "error_recovery": 0.0, "data_gap_recovery": 26.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 9.0, "inconsistent_api_recovery": 19.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 6.0, 
"tool_selection_stateful": 6.0, "basic_2step_stateful": 2.0, "sequential_3step_stateful": 7.0, "conditional_routing_stateful": 19.0, "sequential_reasoning_stateful": 9.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 17.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 26.0, "inconsistent_api_recovery_stateful": 4.0}, "scenarioWastedN": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 45, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 77.7, "argument_fidelity": 159.91, "tool_selection": 115.91, "basic_2step": 71.35, "sequential_3step": 154.92, "conditional_routing": 471.24, "sequential_reasoning": 258.16, "error_recovery": 0.0, "data_gap_recovery": 676.05, "data_gap_recovery_extended": 682.7, "argument_transformation": 865.52, "grounded_synthesis": 723.71, "inconsistent_api_recovery": 1162.45, "relevance_detection_stateful": 72.33, "argument_fidelity_stateful": 144.22, "tool_selection_stateful": 117.26, "basic_2step_stateful": 76.18, "sequential_3step_stateful": 149.45, "conditional_routing_stateful": 467.42, "sequential_reasoning_stateful": 301.96, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 661.1, 
"data_gap_recovery_extended_stateful": 644.33, "argument_transformation_stateful": 813.67, "grounded_synthesis_stateful": 759.14, "inconsistent_api_recovery_stateful": 1152.18}, "scenarioSpeedN": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 45, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 47}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [bare]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 67.7, "accuracy": 76.9, "completeness": 88.1, "efficiency": 100.0, "wasted": 0.3, "speed": 13.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 78, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 92, "sequential_reasoning": 90, "error_recovery": 0, "data_gap_recovery": 98, "data_gap_recovery_extended": 4, "argument_transformation": 10, "grounded_synthesis": 84, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 94, "tool_selection_stateful": 84, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 88, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 84, 
"data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 78, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 46, "sequential_reasoning": 45, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 2, "argument_transformation": 5, "grounded_synthesis": 42, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 47, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, 
"error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 47, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 47, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 117, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 184, "sequential_reasoning": 180, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 16, "argument_transformation": 25, "grounded_synthesis": 420, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 141, "tool_selection_stateful": 126, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, 
"conditional_routing_stateful": 168, "sequential_reasoning_stateful": 176, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 390, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 156, "tool_selection": 120, "basic_2step": 100, "sequential_3step": 158, "conditional_routing": 213, "sequential_reasoning": 194, "error_recovery": 0, "data_gap_recovery": 257, "data_gap_recovery_extended": 11, "argument_transformation": 23, "grounded_synthesis": 174, "inconsistent_api_recovery": 338, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 128, "basic_2step_stateful": 101, "sequential_3step_stateful": 169, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 185, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 204, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 6.0, "tool_selection": 3.0, "basic_2step": 0.0, "sequential_3step": 11.0, "conditional_routing": 34.0, "sequential_reasoning": 14.0, "error_recovery": 0.0, "data_gap_recovery": 36.0, "data_gap_recovery_extended": 18.0, "argument_transformation": 4.0, "grounded_synthesis": 8.0, "inconsistent_api_recovery": 22.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 9.0, "tool_selection_stateful": 2.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 19.0, "conditional_routing_stateful": 22.0, "sequential_reasoning_stateful": 11.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 24.0, "data_gap_recovery_extended_stateful": 12.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 43.0, "inconsistent_api_recovery_stateful": 
21.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 47, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 110.27, "argument_fidelity": 229.08, "tool_selection": 162.25, "basic_2step": 90.13, "sequential_3step": 246.7, "conditional_routing": 675.64, "sequential_reasoning": 394.03, "error_recovery": 0.0, "data_gap_recovery": 1037.47, "data_gap_recovery_extended": 1002.48, "argument_transformation": 1120.9, "grounded_synthesis": 964.04, "inconsistent_api_recovery": 1763.03, "relevance_detection_stateful": 106.93, "argument_fidelity_stateful": 230.45, "tool_selection_stateful": 166.66, "basic_2step_stateful": 104.96, "sequential_3step_stateful": 260.39, "conditional_routing_stateful": 679.65, "sequential_reasoning_stateful": 417.91, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 894.99, "data_gap_recovery_extended_stateful": 1069.06, "argument_transformation_stateful": 1115.04, "grounded_synthesis_stateful": 1174.8, "inconsistent_api_recovery_stateful": 1797.97}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 49, 
"data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 47, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}}, {"label": "ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged]", "model": "ministral-3:8b-instruct-2512-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 66.8, "accuracy": 71.9, "completeness": 92.9, "efficiency": 67.7, "wasted": 1.4, "speed": 5.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 76, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 68, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 4, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 28, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 80, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 22}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 34, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 11}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 114, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 68, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 20, "inconsistent_api_recovery": 256, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 42, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 88}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 
200, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 118, "conditional_routing": 258, "sequential_reasoning": 250, "error_recovery": 194, "data_gap_recovery": 315, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 18, "inconsistent_api_recovery": 504, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 200, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 42, "conditional_routing_stateful": 260, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 252, "data_gap_recovery_stateful": 344, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 165}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 50.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 4.0, "conditional_routing": 58.0, "sequential_reasoning": 50.0, "error_recovery": 151.0, "data_gap_recovery": 90.0, "data_gap_recovery_extended": 33.0, "argument_transformation": 18.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 248.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 60.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 137.0, "data_gap_recovery_stateful": 101.0, "data_gap_recovery_extended_stateful": 18.0, "argument_transformation_stateful": 25.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 241.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, 
"inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 32.52, "argument_fidelity": 79.72, "tool_selection": 64.14, "basic_2step": 37.52, "sequential_3step": 55.8, "conditional_routing": 290.91, "sequential_reasoning": 137.39, "error_recovery": 234.75, "data_gap_recovery": 356.68, "data_gap_recovery_extended": 581.33, "argument_transformation": 345.16, "grounded_synthesis": 473.91, "inconsistent_api_recovery": 447.82, "relevance_detection_stateful": 32.54, "argument_fidelity_stateful": 80.5, "tool_selection_stateful": 64.1, "basic_2step_stateful": 32.0, "sequential_3step_stateful": 16.24, "conditional_routing_stateful": 245.28, "sequential_reasoning_stateful": 138.62, "error_recovery_stateful": 310.54, "data_gap_recovery_stateful": 357.36, "data_gap_recovery_extended_stateful": 505.19, "argument_transformation_stateful": 372.93, "grounded_synthesis_stateful": 563.91, "inconsistent_api_recovery_stateful": 649.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 66.5, "accuracy": 80.0, "completeness": 83.1, "efficiency": 100.0, "wasted": 0.0, "speed": 2.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 32}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 40, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 40, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 128}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 160, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 166, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 160, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 144, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 49}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 5.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 5.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 40, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.58, "argument_fidelity": 59.2, "tool_selection": 52.17, "basic_2step": 24.06, "sequential_3step": 67.76, "conditional_routing": 182.6, "sequential_reasoning": 111.02, "error_recovery": 0.0, "data_gap_recovery": 149.79, "data_gap_recovery_extended": 171.01, "argument_transformation": 164.24, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 278.15, "relevance_detection_stateful": 16.58, "argument_fidelity_stateful": 59.3, "tool_selection_stateful": 52.13, "basic_2step_stateful": 24.07, "sequential_3step_stateful": 69.97, "conditional_routing_stateful": 177.16, "sequential_reasoning_stateful": 105.87, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 149.71, "data_gap_recovery_extended_stateful": 178.43, "argument_transformation_stateful": 172.31, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 272.79}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 40, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [bare]", "model": 
"Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "mistral-small-3.2", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 65.7, "accuracy": 81.1, "completeness": 81.0, "efficiency": 79.7, "wasted": 0.9, "speed": 3.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 96, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 16, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 23, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 80, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 184, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 80, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 900, "inconsistent_api_recovery": 151, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 101, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 887, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 32.0, "grounded_synthesis": 400.0, "inconsistent_api_recovery": 13.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 18.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 23.0, "grounded_synthesis_stateful": 392.0, "inconsistent_api_recovery_stateful": 7.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}, "scenarioSpeedSum": {"relevance_detection": 31.16, "argument_fidelity": 100.2, "tool_selection": 69.3, "basic_2step": 42.55, "sequential_3step": 102.09, "conditional_routing": 196.04, 
"sequential_reasoning": 112.68, "error_recovery": 0.0, "data_gap_recovery": 186.43, "data_gap_recovery_extended": 201.34, "argument_transformation": 31.59, "grounded_synthesis": 528.4, "inconsistent_api_recovery": 169.87, "relevance_detection_stateful": 31.21, "argument_fidelity_stateful": 98.75, "tool_selection_stateful": 70.87, "basic_2step_stateful": 62.17, "sequential_3step_stateful": 101.54, "conditional_routing_stateful": 198.61, "sequential_reasoning_stateful": 114.02, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 194.57, "data_gap_recovery_extended_stateful": 202.49, "argument_transformation_stateful": 22.81, "grounded_synthesis_stateful": 528.87, "inconsistent_api_recovery_stateful": 191.34}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}}, {"label": "granite-4.1-8b-Q8_0 LS/N [reforged]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 65.4, "accuracy": 65.4, "completeness": 100.0, "efficiency": 88.2, "wasted": 1.4, "speed": 2.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, 
"sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 199, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 49.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 
0.0, "argument_transformation": 397.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 397.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 31.75, "argument_fidelity": 63.65, "tool_selection": 62.49, "basic_2step": 34.65, "sequential_3step": 52.65, "conditional_routing": 185.2, "sequential_reasoning": 80.99, "error_recovery": 53.67, "data_gap_recovery": 200.37, "data_gap_recovery_extended": 250.34, "argument_transformation": 251.15, "grounded_synthesis": 295.04, "inconsistent_api_recovery": 245.29, "relevance_detection_stateful": 31.6, "argument_fidelity_stateful": 64.45, "tool_selection_stateful": 51.89, "basic_2step_stateful": 38.08, "sequential_3step_stateful": 52.6, 
"conditional_routing_stateful": 202.85, "sequential_reasoning_stateful": 80.91, "error_recovery_stateful": 54.16, "data_gap_recovery_stateful": 200.31, "data_gap_recovery_extended_stateful": 246.17, "argument_transformation_stateful": 250.13, "grounded_synthesis_stateful": 293.89, "inconsistent_api_recovery_stateful": 244.19}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:8b-q4_K_M OL/N [reforged]", "model": "qwen3:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 64.9, "accuracy": 65.1, "completeness": 99.8, "efficiency": 84.7, "wasted": 0.6, "speed": 21.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 98, "error_recovery": 30, "data_gap_recovery": 62, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 2, "inconsistent_api_recovery": 74, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, 
"conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 26, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 18}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 15, "data_gap_recovery": 31, "data_gap_recovery_extended": 1, "argument_transformation": 3, "grounded_synthesis": 1, "inconsistent_api_recovery": 37, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 13, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 9}, "scenarioCompleted": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 196, "error_recovery": 30, "data_gap_recovery": 155, "data_gap_recovery_extended": 8, "argument_transformation": 15, "grounded_synthesis": 10, "inconsistent_api_recovery": 296, 
"relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 39, "data_gap_recovery_stateful": 175, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 72}, "scenarioActualCalls": {"relevance_detection": 70, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 187, "sequential_3step": 151, "conditional_routing": 249, "sequential_reasoning": 196, "error_recovery": 47, "data_gap_recovery": 164, "data_gap_recovery_extended": 8, "argument_transformation": 12, "grounded_synthesis": 6, "inconsistent_api_recovery": 354, "relevance_detection_stateful": 64, "argument_fidelity_stateful": 150, "tool_selection_stateful": 203, "basic_2step_stateful": 137, "sequential_3step_stateful": 150, "conditional_routing_stateful": 251, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 192, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 111}, "scenarioWastedSum": {"relevance_detection": 20.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 87.0, "sequential_3step": 1.0, "conditional_routing": 57.0, "sequential_reasoning": 0.0, "error_recovery": 86.0, "data_gap_recovery": 27.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 90.0, "relevance_detection_stateful": 15.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 53.0, "basic_2step_stateful": 37.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 55.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 42.0, 
"data_gap_recovery_stateful": 28.0, "data_gap_recovery_extended_stateful": 12.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 107.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 177.68, "argument_fidelity": 313.89, "tool_selection": 515.08, "basic_2step": 346.04, "sequential_3step": 404.54, "conditional_routing": 1074.84, "sequential_reasoning": 389.12, "error_recovery": 485.21, "data_gap_recovery": 841.17, "data_gap_recovery_extended": 1241.74, "argument_transformation": 2480.48, "grounded_synthesis": 1647.12, "inconsistent_api_recovery": 3610.82, "relevance_detection_stateful": 190.1, "argument_fidelity_stateful": 297.64, "tool_selection_stateful": 533.84, "basic_2step_stateful": 248.29, "sequential_3step_stateful": 472.71, "conditional_routing_stateful": 1047.48, "sequential_reasoning_stateful": 413.98, "error_recovery_stateful": 459.71, "data_gap_recovery_stateful": 905.37, "data_gap_recovery_extended_stateful": 1283.58, "argument_transformation_stateful": 2440.56, "grounded_synthesis_stateful": 1741.41, "inconsistent_api_recovery_stateful": 3714.22}, "scenarioSpeedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407.Q4_K_M LF/P [reforged]", "model": "Mistral-Nemo-Instruct-2407.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 64.8, "accuracy": 66.6, "completeness": 97.3, "efficiency": 100.0, "wasted": 0.4, "speed": 4.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 96, "error_recovery": 92, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 82, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 50, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 88, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 68, "inconsistent_api_recovery_stateful": 0}, 
"scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 46, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 41, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 25, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 44, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 
42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 192, "error_recovery": 92, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 410, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 75, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 132, "data_gap_recovery_stateful": 15, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 340, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 121, "basic_2step": 101, "sequential_3step": 150, "conditional_routing": 247, "sequential_reasoning": 189, "error_recovery": 146, "data_gap_recovery": 8, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 256, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 75, "basic_2step_stateful": 103, "sequential_3step_stateful": 150, "conditional_routing_stateful": 248, "sequential_reasoning_stateful": 197, "error_recovery_stateful": 136, "data_gap_recovery_stateful": 13, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 216, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 1.0, "sequential_3step": 0.0, "conditional_routing": 62.0, "sequential_reasoning": 1.0, "error_recovery": 60.0, "data_gap_recovery": 47.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 141.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 3.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 61.0, "sequential_reasoning_stateful": 3.0, "error_recovery_stateful": 7.0, "data_gap_recovery_stateful": 52.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 121.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}, "scenarioSpeedSum": {"relevance_detection": 37.18, "argument_fidelity": 99.91, "tool_selection": 83.36, "basic_2step": 66.41, "sequential_3step": 95.75, "conditional_routing": 279.02, "sequential_reasoning": 151.46, "error_recovery": 102.95, "data_gap_recovery": 311.3, "data_gap_recovery_extended": 415.25, "argument_transformation": 335.33, "grounded_synthesis": 657.15, "inconsistent_api_recovery": 396.82, "relevance_detection_stateful": 37.8, "argument_fidelity_stateful": 99.85, "tool_selection_stateful": 81.45, "basic_2step_stateful": 75.02, "sequential_3step_stateful": 95.86, "conditional_routing_stateful": 307.41, "sequential_reasoning_stateful": 153.92, "error_recovery_stateful": 94.02, "data_gap_recovery_stateful": 337.48, "data_gap_recovery_extended_stateful": 414.34, "argument_transformation_stateful": 327.11, "grounded_synthesis_stateful": 676.49, "inconsistent_api_recovery_stateful": 289.79}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 
50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}}, {"label": "granite-4.1-8b-Q4_K_M LS/N [reforged]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 65.4, "accuracy": 68.0, "completeness": 96.2, "efficiency": 89.7, "wasted": 0.8, "speed": 1.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 250, 
"data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 21.12, "argument_fidelity": 41.11, "tool_selection": 33.67, "basic_2step": 22.09, "sequential_3step": 34.12, "conditional_routing": 263.87, "sequential_reasoning": 52.56, "error_recovery": 35.1, "data_gap_recovery": 129.16, "data_gap_recovery_extended": 167.27, "argument_transformation": 0.0, "grounded_synthesis": 159.4, "inconsistent_api_recovery": 157.39, "relevance_detection_stateful": 22.27, "argument_fidelity_stateful": 41.9, "tool_selection_stateful": 33.87, "basic_2step_stateful": 24.66, "sequential_3step_stateful": 34.04, "conditional_routing_stateful": 96.17, "sequential_reasoning_stateful": 58.39, "error_recovery_stateful": 35.55, "data_gap_recovery_stateful": 133.19, "data_gap_recovery_extended_stateful": 173.6, "argument_transformation_stateful": 169.24, "grounded_synthesis_stateful": 146.77, "inconsistent_api_recovery_stateful": 164.61}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/N [reforged]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 65.3, "accuracy": 70.8, "completeness": 92.3, "efficiency": 89.9, "wasted": 0.4, "speed": 2.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 
50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 250, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 200, "data_gap_recovery": 200, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 454, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 250, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 200, 
"data_gap_recovery_stateful": 196, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 100.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 100.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 25.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 39.51, "argument_fidelity": 75.28, "tool_selection": 109.15, "basic_2step": 
44.05, "sequential_3step": 68.53, "conditional_routing": 219.79, "sequential_reasoning": 90.97, "error_recovery": 73.56, "data_gap_recovery": 157.72, "data_gap_recovery_extended": 218.88, "argument_transformation": 134.18, "grounded_synthesis": 271.78, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 39.37, "argument_fidelity_stateful": 74.85, "tool_selection_stateful": 108.9, "basic_2step_stateful": 51.7, "sequential_3step_stateful": 68.39, "conditional_routing_stateful": 151.95, "sequential_reasoning_stateful": 228.81, "error_recovery_stateful": 73.11, "data_gap_recovery_stateful": 163.44, "data_gap_recovery_extended_stateful": 220.78, "argument_transformation_stateful": 134.69, "grounded_synthesis_stateful": 303.06, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [bare]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 65.5, "accuracy": 75.6, "completeness": 86.6, "efficiency": 100.0, "wasted": 0.1, "speed": 23.0, "n": 50, "scenarios": {"relevance_detection": 96, 
"argument_fidelity": 100, "tool_selection": 32, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 94, "data_gap_recovery_extended": 16, "argument_transformation": 52, "grounded_synthesis": 46, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 100, "tool_selection_stateful": 28, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 60, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 8, "argument_transformation": 26, "grounded_synthesis": 23, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 30, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 48, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 235, "data_gap_recovery_extended": 64, "argument_transformation": 130, "grounded_synthesis": 230, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, "tool_selection_stateful": 42, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 120, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 125, "grounded_synthesis_stateful": 250, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 48, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 151, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 190, "data_gap_recovery_extended": 27, "argument_transformation": 96, "grounded_synthesis": 151, "inconsistent_api_recovery": 323, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, "tool_selection_stateful": 42, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 197, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 90, "grounded_synthesis_stateful": 134, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 5.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, 
"data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 6.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 28.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 4.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 4.0, "inconsistent_api_recovery_stateful": 24.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 347.79, "argument_fidelity": 296.49, "tool_selection": 68.71, "basic_2step": 426.66, "sequential_3step": 472.57, "conditional_routing": 609.54, "sequential_reasoning": 541.01, "error_recovery": 0.0, "data_gap_recovery": 914.84, "data_gap_recovery_extended": 1059.1, "argument_transformation": 2353.43, "grounded_synthesis": 2576.22, "inconsistent_api_recovery": 3609.53, "relevance_detection_stateful": 350.41, "argument_fidelity_stateful": 278.55, "tool_selection_stateful": 53.36, 
"basic_2step_stateful": 504.91, "sequential_3step_stateful": 524.72, "conditional_routing_stateful": 543.93, "sequential_reasoning_stateful": 759.5, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 998.17, "data_gap_recovery_extended_stateful": 1020.58, "argument_transformation_stateful": 2034.93, "grounded_synthesis_stateful": 2568.03, "inconsistent_api_recovery_stateful": 2948.49}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 65.2, "accuracy": 78.1, "completeness": 83.5, "efficiency": 100.0, "wasted": 0.3, "speed": 3.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 98, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 74, "sequential_reasoning": 92, "error_recovery": 0, "data_gap_recovery": 62, "data_gap_recovery_extended": 30, "argument_transformation": 0, "grounded_synthesis": 56, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, 
"argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 78, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 72, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 58, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 37, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 31, "data_gap_recovery_extended": 15, "argument_transformation": 0, "grounded_synthesis": 28, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 39, "argument_transformation": 42, "grounded_synthesis": 44, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 45}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 39, "argument_transformation": 42, "grounded_synthesis": 44, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 45}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 148, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 155, 
"data_gap_recovery_extended": 120, "argument_transformation": 0, "grounded_synthesis": 280, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 156, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 180, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 290, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 168, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 115, "data_gap_recovery_extended": 45, "argument_transformation": 0, "grounded_synthesis": 166, "inconsistent_api_recovery": 336, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 191, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 140, "data_gap_recovery_extended_stateful": 27, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 206, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 28.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 18.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 61.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, 
"conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 18.0, "grounded_synthesis_stateful": 26.0, "inconsistent_api_recovery_stateful": 93.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 39, "argument_transformation": 42, "grounded_synthesis": 44, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 45}, "scenarioSpeedSum": {"relevance_detection": 17.35, "argument_fidelity": 58.38, "tool_selection": 50.59, "basic_2step": 29.94, "sequential_3step": 64.09, "conditional_routing": 177.86, "sequential_reasoning": 76.38, "error_recovery": 0.0, "data_gap_recovery": 158.25, "data_gap_recovery_extended": 298.4, "argument_transformation": 345.71, "grounded_synthesis": 440.21, "inconsistent_api_recovery": 177.92, "relevance_detection_stateful": 17.11, "argument_fidelity_stateful": 59.32, "tool_selection_stateful": 49.93, "basic_2step_stateful": 33.52, "sequential_3step_stateful": 65.72, "conditional_routing_stateful": 179.65, "sequential_reasoning_stateful": 83.07, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 161.39, "data_gap_recovery_extended_stateful": 214.8, "argument_transformation_stateful": 301.57, "grounded_synthesis_stateful": 365.93, 
"inconsistent_api_recovery_stateful": 229.95}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 39, "argument_transformation": 42, "grounded_synthesis": 44, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 45}}, {"label": "Qwen3-8B-Q8_0 LS/P [bare]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 64.5, "accuracy": 70.4, "completeness": 91.6, "efficiency": 96.1, "wasted": 0.2, "speed": 27.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 92, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 86, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 20, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 98, "sequential_3step_stateful": 98, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, 
"scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 10, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 49, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 
50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 138, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 215, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 100, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, "sequential_3step_stateful": 147, "conditional_routing_stateful": 152, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 138, "basic_2step": 100, "sequential_3step": 152, "conditional_routing": 217, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 224, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 91, "inconsistent_api_recovery": 460, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, "sequential_3step_stateful": 148, "conditional_routing_stateful": 190, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 90, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 2.0, "conditional_routing": 33.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 10.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 7.0, "inconsistent_api_recovery": 63.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 38.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 8.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 49.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 74.86, "argument_fidelity": 299.61, "tool_selection": 314.48, "basic_2step": 275.2, "sequential_3step": 702.0, "conditional_routing": 1159.89, "sequential_reasoning": 488.33, "error_recovery": 0.0, "data_gap_recovery": 981.32, "data_gap_recovery_extended": 1505.89, "argument_transformation": 2741.03, "grounded_synthesis": 3346.92, "inconsistent_api_recovery": 4451.22, "relevance_detection_stateful": 81.72, "argument_fidelity_stateful": 295.84, "tool_selection_stateful": 357.44, "basic_2step_stateful": 296.93, "sequential_3step_stateful": 610.31, "conditional_routing_stateful": 1174.97, "sequential_reasoning_stateful": 510.41, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 983.37, "data_gap_recovery_extended_stateful": 1611.67, "argument_transformation_stateful": 2683.18, "grounded_synthesis_stateful": 3670.88, "inconsistent_api_recovery_stateful": 4003.37}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q4_K_M LS/P [reforged]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 61.5, "accuracy": 61.5, "completeness": 100.0, "efficiency": 90.4, "wasted": 0.3, "speed": 2.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 100, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 250, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 250, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 350, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 150, 
"data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 150, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 350, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 200.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 200.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 54.71, "argument_fidelity": 40.03, "tool_selection": 156.39, "basic_2step": 20.05, "sequential_3step": 40.53, "conditional_routing": 106.76, "sequential_reasoning": 53.24, "error_recovery": 23.05, "data_gap_recovery": 75.6, "data_gap_recovery_extended": 177.68, "argument_transformation": 472.34, "grounded_synthesis": 146.9, "inconsistent_api_recovery": 163.59, "relevance_detection_stateful": 57.18, "argument_fidelity_stateful": 40.76, "tool_selection_stateful": 157.13, "basic_2step_stateful": 22.04, "sequential_3step_stateful": 40.53, "conditional_routing_stateful": 101.63, "sequential_reasoning_stateful": 52.64, "error_recovery_stateful": 23.05, "data_gap_recovery_stateful": 77.1, "data_gap_recovery_extended_stateful": 177.64, "argument_transformation_stateful": 471.99, "grounded_synthesis_stateful": 134.15, "inconsistent_api_recovery_stateful": 367.67}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [bare]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 61.9, "accuracy": 67.1, "completeness": 92.2, "efficiency": 94.5, "wasted": 0.3, "speed": 12.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 58, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 24, "grounded_synthesis": 20, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 29, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 12, "grounded_synthesis": 10, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 116, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 60, "grounded_synthesis": 100, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 130, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 115, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 255, "data_gap_recovery_extended": 0, "argument_transformation": 48, "grounded_synthesis": 106, "inconsistent_api_recovery": 489, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 10, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 251, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 149, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 35.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 27.0, "inconsistent_api_recovery": 105.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 2.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 31.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 41.0, "inconsistent_api_recovery_stateful": 114.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 46.73, "argument_fidelity": 168.51, "tool_selection": 151.86, "basic_2step": 
83.67, "sequential_3step": 144.6, "conditional_routing": 562.96, "sequential_reasoning": 269.33, "error_recovery": 0.0, "data_gap_recovery": 791.42, "data_gap_recovery_extended": 789.16, "argument_transformation": 1389.55, "grounded_synthesis": 1231.12, "inconsistent_api_recovery": 1678.0, "relevance_detection_stateful": 49.22, "argument_fidelity_stateful": 171.4, "tool_selection_stateful": 158.88, "basic_2step_stateful": 76.52, "sequential_3step_stateful": 145.21, "conditional_routing_stateful": 587.57, "sequential_reasoning_stateful": 272.92, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 763.33, "data_gap_recovery_extended_stateful": 805.62, "argument_transformation_stateful": 1446.67, "grounded_synthesis_stateful": 1173.41, "inconsistent_api_recovery_stateful": 1708.54}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q8_0 LS/P [reforged]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 61.5, "accuracy": 66.7, "completeness": 92.3, "efficiency": 73.4, "wasted": 1.0, "speed": 5.2, "n": 50, "scenarios": {"relevance_detection": 0, 
"argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 350, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 300, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 695, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 350, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 300, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 200, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 695, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 200.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 100.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 
0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 196.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 200.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 100.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 196.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 61.53, "tool_selection": 225.98, "basic_2step": 31.03, "sequential_3step": 61.05, "conditional_routing": 301.36, "sequential_reasoning": 78.05, "error_recovery": 67.96, "data_gap_recovery": 162.19, "data_gap_recovery_extended": 185.21, "argument_transformation": 1130.87, "grounded_synthesis": 272.71, "inconsistent_api_recovery": 526.98, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 62.19, "tool_selection_stateful": 226.38, "basic_2step_stateful": 34.9, 
"sequential_3step_stateful": 62.53, "conditional_routing_stateful": 289.94, "sequential_reasoning_stateful": 75.03, "error_recovery_stateful": 68.53, "data_gap_recovery_stateful": 166.51, "data_gap_recovery_extended_stateful": 185.1, "argument_transformation_stateful": 1130.68, "grounded_synthesis_stateful": 272.56, "inconsistent_api_recovery_stateful": 515.07}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma4:e4b-it-q8_0 OL/N [bare]", "model": "gemma4:e4b-it-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 62.5, "accuracy": 69.9, "completeness": 89.3, "efficiency": 88.9, "wasted": 0.5, "speed": 12.0, "n": 50, "scenarios": {"relevance_detection": 90, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 92, "error_recovery": 0, "data_gap_recovery": 72, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 100, "conditional_routing_stateful": 66, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 24, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 33, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": 
{"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 45, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 180, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 240, "inconsistent_api_recovery": 200, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 132, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 45, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 206, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 191, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 365, "inconsistent_api_recovery": 244, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 224, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 313, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 34.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 187.0, "inconsistent_api_recovery": 71.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 32.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 18.0, 
"data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 159.0, "inconsistent_api_recovery_stateful": 68.0}, "scenarioWastedN": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 104.32, "argument_fidelity": 144.59, "tool_selection": 135.98, "basic_2step": 86.76, "sequential_3step": 146.31, "conditional_routing": 524.41, "sequential_reasoning": 231.32, "error_recovery": 0.0, "data_gap_recovery": 547.74, "data_gap_recovery_extended": 823.16, "argument_transformation": 1444.28, "grounded_synthesis": 1187.06, "inconsistent_api_recovery": 1562.61, "relevance_detection_stateful": 122.64, "argument_fidelity_stateful": 138.12, "tool_selection_stateful": 134.42, "basic_2step_stateful": 78.11, "sequential_3step_stateful": 154.8, "conditional_routing_stateful": 545.16, "sequential_reasoning_stateful": 245.18, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 654.16, "data_gap_recovery_extended_stateful": 756.53, "argument_transformation_stateful": 1450.34, "grounded_synthesis_stateful": 1143.72, "inconsistent_api_recovery_stateful": 1561.77}, "scenarioSpeedN": {"relevance_detection": 45, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}}, {"label": "gemma4:e4b-it-q4_K_M OL/N [bare]", "model": "gemma4:e4b-it-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 62.3, "accuracy": 71.7, "completeness": 86.9, "efficiency": 89.8, "wasted": 0.4, "speed": 8.7, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 88, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 82, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 96, "argument_fidelity_stateful": 100, "tool_selection_stateful": 92, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 72, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 25, "inconsistent_api_recovery": 12, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 205, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 250, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 150, "tool_selection_stateful": 138, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 180, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 190, 
"inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 236, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 222, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 357, "inconsistent_api_recovery": 113, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 150, "tool_selection_stateful": 138, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 210, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 192, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 275, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 22.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 160.0, "inconsistent_api_recovery": 22.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 42.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 20.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 8.0, "grounded_synthesis_stateful": 144.0, "inconsistent_api_recovery_stateful": 36.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}, "scenarioSpeedSum": {"relevance_detection": 88.15, "argument_fidelity": 107.22, "tool_selection": 100.8, "basic_2step": 62.48, "sequential_3step": 110.07, "conditional_routing": 416.86, "sequential_reasoning": 221.52, "error_recovery": 0.0, "data_gap_recovery": 534.08, "data_gap_recovery_extended": 705.24, "argument_transformation": 912.43, "grounded_synthesis": 816.48, "inconsistent_api_recovery": 652.24, "relevance_detection_stateful": 99.63, "argument_fidelity_stateful": 107.96, "tool_selection_stateful": 98.95, "basic_2step_stateful": 63.36, "sequential_3step_stateful": 109.32, "conditional_routing_stateful": 451.5, "sequential_reasoning_stateful": 205.7, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 559.44, "data_gap_recovery_extended_stateful": 704.37, "argument_transformation_stateful": 993.37, "grounded_synthesis_stateful": 899.33, "inconsistent_api_recovery_stateful": 776.24}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [bare]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 60.6, "accuracy": 65.8, "completeness": 92.1, "efficiency": 91.8, "wasted": 0.5, "speed": 8.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 54, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 22, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 12, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 27, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 11, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 108, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 110, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 190, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 116, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 28, "grounded_synthesis": 154, 
"inconsistent_api_recovery": 477, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 208, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 143, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 8.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 30.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 122.0, "inconsistent_api_recovery": 111.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 5.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 26.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 121.0, "inconsistent_api_recovery_stateful": 129.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 32.37, "argument_fidelity": 81.01, "tool_selection": 109.91, "basic_2step": 45.84, "sequential_3step": 102.94, "conditional_routing": 438.7, "sequential_reasoning": 168.69, "error_recovery": 0.0, "data_gap_recovery": 515.63, "data_gap_recovery_extended": 454.86, "argument_transformation": 1176.23, "grounded_synthesis": 864.11, "inconsistent_api_recovery": 1076.24, "relevance_detection_stateful": 35.78, "argument_fidelity_stateful": 83.44, "tool_selection_stateful": 103.11, "basic_2step_stateful": 45.77, "sequential_3step_stateful": 105.71, "conditional_routing_stateful": 432.01, "sequential_reasoning_stateful": 165.16, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 509.76, "data_gap_recovery_extended_stateful": 458.21, "argument_transformation_stateful": 1142.14, "grounded_synthesis_stateful": 861.5, "inconsistent_api_recovery_stateful": 991.69}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/P [bare]", "model": 
"Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 59.7, "accuracy": 77.0, "completeness": 77.5, "efficiency": 100.0, "wasted": 0.1, "speed": 9.8, "n": 50, "scenarios": {"relevance_detection": 30, "argument_fidelity": 100, "tool_selection": 2, "basic_2step": 98, "sequential_3step": 98, "conditional_routing": 96, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 92, "data_gap_recovery_extended": 60, "argument_transformation": 20, "grounded_synthesis": 44, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 100, "tool_selection_stateful": 4, "basic_2step_stateful": 92, "sequential_3step_stateful": 100, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 58, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 15, "argument_fidelity": 50, 
"tool_selection": 1, "basic_2step": 49, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 30, "argument_transformation": 10, "grounded_synthesis": 22, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 24, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 29, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 
50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 15, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 98, "sequential_3step": 147, "conditional_routing": 192, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 230, "data_gap_recovery_extended": 240, "argument_transformation": 50, "grounded_synthesis": 220, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 24, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 92, "sequential_3step_stateful": 150, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 232, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 15, "argument_fidelity": 151, "tool_selection": 3, "basic_2step": 99, "sequential_3step": 148, "conditional_routing": 207, "sequential_reasoning": 197, "error_recovery": 0, "data_gap_recovery": 177, "data_gap_recovery_extended": 116, "argument_transformation": 33, "grounded_synthesis": 116, "inconsistent_api_recovery": 196, "relevance_detection_stateful": 24, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 93, "sequential_3step_stateful": 151, "conditional_routing_stateful": 185, "sequential_reasoning_stateful": 193, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 106, "argument_transformation_stateful": 20, 
"grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 4}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 0.0, "basic_2step": 1.0, "sequential_3step": 1.0, "conditional_routing": 31.0, "sequential_reasoning": 1.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 26.0, "sequential_reasoning_stateful": 1.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 15.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 43.33, "argument_fidelity": 167.39, "tool_selection": 1.81, "basic_2step": 338.95, "sequential_3step": 542.59, "conditional_routing": 450.82, "sequential_reasoning": 332.46, "error_recovery": 0.0, 
"data_gap_recovery": 383.46, "data_gap_recovery_extended": 291.55, "argument_transformation": 1087.66, "grounded_synthesis": 594.92, "inconsistent_api_recovery": 612.19, "relevance_detection_stateful": 47.18, "argument_fidelity_stateful": 233.27, "tool_selection_stateful": 4.0, "basic_2step_stateful": 182.56, "sequential_3step_stateful": 556.96, "conditional_routing_stateful": 433.81, "sequential_reasoning_stateful": 308.72, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 383.72, "data_gap_recovery_extended_stateful": 383.58, "argument_transformation_stateful": 1171.86, "grounded_synthesis_stateful": 640.78, "inconsistent_api_recovery_stateful": 705.34}, "scenarioSpeedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 59.7, "accuracy": 80.7, "completeness": 74.0, "efficiency": 100.0, "wasted": 0.4, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 48, "argument_fidelity": 92, "tool_selection": 38, "basic_2step": 100, "sequential_3step": 96, 
"conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 94, "data_gap_recovery_extended": 54, "argument_transformation": 10, "grounded_synthesis": 16, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 54, "argument_fidelity_stateful": 92, "tool_selection_stateful": 32, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 24, "argument_fidelity": 46, "tool_selection": 19, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 27, "argument_transformation": 5, "grounded_synthesis": 8, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 27, "argument_fidelity_stateful": 46, "tool_selection_stateful": 16, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 49, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 7, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 24, "argument_fidelity": 46, "tool_selection": 19, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 38, "argument_transformation": 21, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 27, "argument_fidelity_stateful": 46, "tool_selection_stateful": 16, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 24, "argument_fidelity": 46, "tool_selection": 19, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 38, "argument_transformation": 21, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 27, "argument_fidelity_stateful": 46, "tool_selection_stateful": 16, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": 
{"relevance_detection": 24, "argument_fidelity": 138, "tool_selection": 57, "basic_2step": 100, "sequential_3step": 144, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 235, "data_gap_recovery_extended": 216, "argument_transformation": 25, "grounded_synthesis": 80, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 27, "argument_fidelity_stateful": 138, "tool_selection_stateful": 48, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 192, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 70, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 24, "argument_fidelity": 138, "tool_selection": 57, "basic_2step": 100, "sequential_3step": 144, "conditional_routing": 239, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 207, "data_gap_recovery_extended": 178, "argument_transformation": 29, "grounded_synthesis": 73, "inconsistent_api_recovery": 343, "relevance_detection_stateful": 27, "argument_fidelity_stateful": 138, "tool_selection_stateful": 48, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 239, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 198, "data_gap_recovery_extended_stateful": 134, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 12}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 46.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 46.0, "grounded_synthesis": 
47.0, "inconsistent_api_recovery": 66.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 46.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 40.0, "grounded_synthesis_stateful": 11.0, "inconsistent_api_recovery_stateful": 86.0}, "scenarioWastedN": {"relevance_detection": 24, "argument_fidelity": 46, "tool_selection": 19, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 38, "argument_transformation": 21, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 27, "argument_fidelity_stateful": 46, "tool_selection_stateful": 16, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 13.95, "argument_fidelity": 54.45, "tool_selection": 20.23, "basic_2step": 29.03, "sequential_3step": 61.97, "conditional_routing": 172.96, "sequential_reasoning": 79.03, "error_recovery": 0.0, "data_gap_recovery": 200.63, "data_gap_recovery_extended": 285.5, "argument_transformation": 146.46, "grounded_synthesis": 257.65, "inconsistent_api_recovery": 256.18, "relevance_detection_stateful": 14.84, "argument_fidelity_stateful": 52.12, "tool_selection_stateful": 17.43, "basic_2step_stateful": 32.56, "sequential_3step_stateful": 61.74, "conditional_routing_stateful": 165.54, "sequential_reasoning_stateful": 
78.17, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 182.25, "data_gap_recovery_extended_stateful": 290.3, "argument_transformation_stateful": 100.75, "grounded_synthesis_stateful": 282.27, "inconsistent_api_recovery_stateful": 252.34}, "scenarioSpeedN": {"relevance_detection": 24, "argument_fidelity": 46, "tool_selection": 19, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 38, "argument_transformation": 21, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 27, "argument_fidelity_stateful": 46, "tool_selection_stateful": 16, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [bare]", "model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 58.6, "accuracy": 65.1, "completeness": 90.1, "efficiency": 99.9, "wasted": 0.1, "speed": 10.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 98, "sequential_3step": 98, "conditional_routing": 48, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 88, "data_gap_recovery_extended": 2, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 94, "sequential_3step_stateful": 96, "conditional_routing_stateful": 6, 
"sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 49, "conditional_routing": 24, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 1, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 48, "conditional_routing_stateful": 3, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 98, "sequential_3step": 147, "conditional_routing": 96, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 220, "data_gap_recovery_extended": 8, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 150, "basic_2step_stateful": 94, "sequential_3step_stateful": 144, "conditional_routing_stateful": 12, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 151, "basic_2step": 99, "sequential_3step": 147, "conditional_routing": 85, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 222, "data_gap_recovery_extended": 7, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 399, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 95, "sequential_3step_stateful": 144, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 223, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 1.0, "basic_2step": 1.0, "sequential_3step": 0.0, "conditional_routing": 2.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 7.0, "inconsistent_api_recovery": 32.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, 
"grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 40.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 58.29, "argument_fidelity": 124.79, "tool_selection": 117.68, "basic_2step": 258.74, "sequential_3step": 179.96, "conditional_routing": 339.3, "sequential_reasoning": 281.59, "error_recovery": 0.0, "data_gap_recovery": 552.02, "data_gap_recovery_extended": 780.6, "argument_transformation": 753.67, "grounded_synthesis": 617.04, "inconsistent_api_recovery": 1729.27, "relevance_detection_stateful": 68.4, "argument_fidelity_stateful": 119.64, "tool_selection_stateful": 106.44, "basic_2step_stateful": 295.71, "sequential_3step_stateful": 180.8, "conditional_routing_stateful": 334.02, "sequential_reasoning_stateful": 268.68, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 529.17, "data_gap_recovery_extended_stateful": 790.3, "argument_transformation_stateful": 790.39, "grounded_synthesis_stateful": 618.4, "inconsistent_api_recovery_stateful": 1794.3}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, 
"sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "phi-4-Q4_K_M LS/P [bare]", "model": "phi-4-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "phi-4", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 59.2, "accuracy": 69.4, "completeness": 85.3, "efficiency": 91.9, "wasted": 0.5, "speed": 3.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 22, "sequential_reasoning": 68, "error_recovery": 0, "data_gap_recovery": 88, "data_gap_recovery_extended": 44, "argument_transformation": 28, "grounded_synthesis": 24, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 10, "sequential_reasoning_stateful": 76, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 11, "sequential_reasoning": 34, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 22, "argument_transformation": 14, "grounded_synthesis": 12, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 5, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 38, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 29, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 38, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 29, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 44, "sequential_reasoning": 136, "error_recovery": 0, "data_gap_recovery": 220, "data_gap_recovery_extended": 176, "argument_transformation": 70, "grounded_synthesis": 120, "inconsistent_api_recovery": 128, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 152, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 152, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 90, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 
150, "tool_selection": 150, "basic_2step": 94, "sequential_3step": 150, "conditional_routing": 54, "sequential_reasoning": 136, "error_recovery": 0, "data_gap_recovery": 276, "data_gap_recovery_extended": 148, "argument_transformation": 48, "grounded_synthesis": 212, "inconsistent_api_recovery": 169, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, "sequential_3step_stateful": 150, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 152, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 283, "data_gap_recovery_extended_stateful": 129, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 155, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 10.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 59.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 1.0, "grounded_synthesis": 178.0, "inconsistent_api_recovery": 42.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 5.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 56.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 123.0, "inconsistent_api_recovery_stateful": 47.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 38, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 29, "inconsistent_api_recovery": 49, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.78, "argument_fidelity": 54.67, "tool_selection": 44.93, "basic_2step": 31.25, "sequential_3step": 49.11, "conditional_routing": 161.09, "sequential_reasoning": 79.2, "error_recovery": 0.0, "data_gap_recovery": 184.06, "data_gap_recovery_extended": 194.9, "argument_transformation": 539.3, "grounded_synthesis": 277.76, "inconsistent_api_recovery": 210.36, "relevance_detection_stateful": 22.31, "argument_fidelity_stateful": 54.18, "tool_selection_stateful": 44.71, "basic_2step_stateful": 41.84, "sequential_3step_stateful": 49.07, "conditional_routing_stateful": 166.03, "sequential_reasoning_stateful": 80.52, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 175.57, "data_gap_recovery_extended_stateful": 189.5, "argument_transformation_stateful": 557.33, "grounded_synthesis_stateful": 241.15, "inconsistent_api_recovery_stateful": 238.9}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 38, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 29, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite4.1:8b-q4_K_M OL/N [reforged]", "model": "granite4.1:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 57.8, "accuracy": 57.8, "completeness": 100.0, "efficiency": 80.7, "wasted": 1.3, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 250, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 201, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 250, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 100.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 101.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 400.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 100.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 25.69, "argument_fidelity": 49.69, "tool_selection": 73.35, "basic_2step": 27.18, "sequential_3step": 41.51, "conditional_routing": 106.65, "sequential_reasoning": 54.63, "error_recovery": 55.75, "data_gap_recovery": 127.89, "data_gap_recovery_extended": 155.87, "argument_transformation": 188.25, "grounded_synthesis": 172.23, "inconsistent_api_recovery": 180.94, "relevance_detection_stateful": 25.68, "argument_fidelity_stateful": 49.71, "tool_selection_stateful": 73.35, "basic_2step_stateful": 23.04, "sequential_3step_stateful": 41.57, "conditional_routing_stateful": 97.19, "sequential_reasoning_stateful": 54.61, "error_recovery_stateful": 48.48, "data_gap_recovery_stateful": 126.85, "data_gap_recovery_extended_stateful": 155.86, "argument_transformation_stateful": 188.23, "grounded_synthesis_stateful": 172.18, "inconsistent_api_recovery_stateful": 180.86}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/P [bare]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 57.7, 
"accuracy": 66.7, "completeness": 86.5, "efficiency": 96.7, "wasted": 0.2, "speed": 17.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 50, "basic_2step": 98, "sequential_3step": 100, "conditional_routing": 92, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 12, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 56, "basic_2step_stateful": 94, "sequential_3step_stateful": 100, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 62, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 46, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 25, 
"data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 6, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 28, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 31, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 38, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 28, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 38, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 28, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 75, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 184, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 125, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 60, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 84, "basic_2step_stateful": 94, "sequential_3step_stateful": 150, "conditional_routing_stateful": 152, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 75, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 211, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 119, "data_gap_recovery_extended": 0, "argument_transformation": 28, "grounded_synthesis": 50, "inconsistent_api_recovery": 430, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 84, "basic_2step_stateful": 94, "sequential_3step_stateful": 150, "conditional_routing_stateful": 183, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 149, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, 
"basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 30.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 66.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 31.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 67.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 38, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 28, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 75.15, "argument_fidelity": 189.92, "tool_selection": 119.9, "basic_2step": 186.73, "sequential_3step": 417.15, "conditional_routing": 867.16, "sequential_reasoning": 343.46, "error_recovery": 0.0, "data_gap_recovery": 697.49, "data_gap_recovery_extended": 778.58, "argument_transformation": 1703.23, "grounded_synthesis": 1874.36, "inconsistent_api_recovery": 2563.07, 
"relevance_detection_stateful": 73.56, "argument_fidelity_stateful": 189.96, "tool_selection_stateful": 132.76, "basic_2step_stateful": 228.08, "sequential_3step_stateful": 424.49, "conditional_routing_stateful": 876.18, "sequential_reasoning_stateful": 327.73, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 671.13, "data_gap_recovery_extended_stateful": 809.94, "argument_transformation_stateful": 1566.49, "grounded_synthesis_stateful": 1649.13, "inconsistent_api_recovery_stateful": 2708.67}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 38, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 28, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 54.9, "accuracy": 79.3, "completeness": 69.2, "efficiency": 100.0, "wasted": 0.3, "speed": 2.6, "n": 50, "scenarios": {"relevance_detection": 56, "argument_fidelity": 78, "tool_selection": 8, "basic_2step": 100, "sequential_3step": 96, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 68, "data_gap_recovery_extended": 60, "argument_transformation": 10, 
"grounded_synthesis": 18, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 60, "argument_fidelity_stateful": 82, "tool_selection_stateful": 8, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 72, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 28, "argument_fidelity": 39, "tool_selection": 4, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 30, "argument_transformation": 5, "grounded_synthesis": 9, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 41, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, 
"data_gap_recovery_extended_stateful": 28, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 28, "argument_fidelity": 39, "tool_selection": 4, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 36, "argument_transformation": 23, "grounded_synthesis": 38, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 41, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 28, "argument_fidelity": 39, "tool_selection": 4, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 36, "argument_transformation": 23, "grounded_synthesis": 38, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 41, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 28, "argument_fidelity": 117, "tool_selection": 12, "basic_2step": 100, "sequential_3step": 144, "conditional_routing": 196, "sequential_reasoning": 
200, "error_recovery": 0, "data_gap_recovery": 170, "data_gap_recovery_extended": 240, "argument_transformation": 25, "grounded_synthesis": 90, "inconsistent_api_recovery": 256, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 123, "tool_selection_stateful": 12, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 180, "data_gap_recovery_extended_stateful": 224, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 28, "argument_fidelity": 117, "tool_selection": 12, "basic_2step": 100, "sequential_3step": 144, "conditional_routing": 231, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 146, "data_gap_recovery_extended": 188, "argument_transformation": 26, "grounded_synthesis": 57, "inconsistent_api_recovery": 300, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 123, "tool_selection_stateful": 12, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 213, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 161, "data_gap_recovery_extended_stateful": 180, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 42.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 17.0, "grounded_synthesis": 9.0, "inconsistent_api_recovery": 72.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, 
"sequential_3step_stateful": 0.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 24.0, "inconsistent_api_recovery_stateful": 67.0}, "scenarioWastedN": {"relevance_detection": 28, "argument_fidelity": 39, "tool_selection": 4, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 36, "argument_transformation": 23, "grounded_synthesis": 38, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 41, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 10.49, "argument_fidelity": 30.48, "tool_selection": 2.63, "basic_2step": 19.04, "sequential_3step": 42.93, "conditional_routing": 113.47, "sequential_reasoning": 50.81, "error_recovery": 0.0, "data_gap_recovery": 100.34, "data_gap_recovery_extended": 184.67, "argument_transformation": 128.57, "grounded_synthesis": 298.94, "inconsistent_api_recovery": 178.25, "relevance_detection_stateful": 9.97, "argument_fidelity_stateful": 31.97, "tool_selection_stateful": 4.81, "basic_2step_stateful": 21.32, "sequential_3step_stateful": 41.41, "conditional_routing_stateful": 108.76, "sequential_reasoning_stateful": 50.38, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 99.94, "data_gap_recovery_extended_stateful": 193.76, "argument_transformation_stateful": 90.83, 
"grounded_synthesis_stateful": 377.57, "inconsistent_api_recovery_stateful": 181.38}, "scenarioSpeedN": {"relevance_detection": 28, "argument_fidelity": 39, "tool_selection": 4, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 36, "argument_transformation": 23, "grounded_synthesis": 38, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 41, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 49}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/P [reforged]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 54.2, "accuracy": 59.5, "completeness": 91.1, "efficiency": 74.7, "wasted": 2.0, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 92, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 84, "sequential_reasoning": 0, "error_recovery": 24, "data_gap_recovery": 38, "data_gap_recovery_extended": 20, "argument_transformation": 0, "grounded_synthesis": 34, "inconsistent_api_recovery": 62, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 88, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 32, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 48}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 46, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 42, "sequential_reasoning": 0, "error_recovery": 12, "data_gap_recovery": 19, "data_gap_recovery_extended": 10, "argument_transformation": 0, "grounded_synthesis": 17, "inconsistent_api_recovery": 31, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 44, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 16, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 24}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, 
"argument_transformation": 11, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 11, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 138, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 168, "sequential_reasoning": 0, "error_recovery": 24, "data_gap_recovery": 95, "data_gap_recovery_extended": 80, "argument_transformation": 0, "grounded_synthesis": 170, "inconsistent_api_recovery": 248, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 132, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 48, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 0, 
"error_recovery_stateful": 24, "data_gap_recovery_stateful": 75, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 192}, "scenarioActualCalls": {"relevance_detection": 56, "argument_fidelity": 141, "tool_selection": 124, "basic_2step": 106, "sequential_3step": 206, "conditional_routing": 251, "sequential_reasoning": 0, "error_recovery": 75, "data_gap_recovery": 80, "data_gap_recovery_extended": 64, "argument_transformation": 0, "grounded_synthesis": 291, "inconsistent_api_recovery": 429, "relevance_detection_stateful": 55, "argument_fidelity_stateful": 134, "tool_selection_stateful": 161, "basic_2step_stateful": 112, "sequential_3step_stateful": 63, "conditional_routing_stateful": 280, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 60, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 55, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 261, "inconsistent_api_recovery_stateful": 344}, "scenarioWastedSum": {"relevance_detection": 7.0, "argument_fidelity": 3.0, "tool_selection": 11.0, "basic_2step": 6.0, "sequential_3step": 56.0, "conditional_routing": 100.0, "sequential_reasoning": 36.0, "error_recovery": 306.0, "data_gap_recovery": 25.0, "data_gap_recovery_extended": 17.0, "argument_transformation": 45.0, "grounded_synthesis": 339.0, "inconsistent_api_recovery": 308.0, "relevance_detection_stateful": 5.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 14.0, "basic_2step_stateful": 12.0, "sequential_3step_stateful": 26.0, "conditional_routing_stateful": 92.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 218.0, "data_gap_recovery_stateful": 21.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 48.0, "grounded_synthesis_stateful": 338.0, "inconsistent_api_recovery_stateful": 300.0}, "scenarioWastedN": {"relevance_detection": 49, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 11, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.34, "argument_fidelity": 48.98, "tool_selection": 52.51, "basic_2step": 30.74, "sequential_3step": 77.1, "conditional_routing": 139.17, "sequential_reasoning": 73.98, "error_recovery": 89.76, "data_gap_recovery": 158.52, "data_gap_recovery_extended": 189.64, "argument_transformation": 156.79, "grounded_synthesis": 464.56, "inconsistent_api_recovery": 371.67, "relevance_detection_stateful": 15.59, "argument_fidelity_stateful": 49.85, "tool_selection_stateful": 63.87, "basic_2step_stateful": 36.81, "sequential_3step_stateful": 26.37, "conditional_routing_stateful": 131.04, "sequential_reasoning_stateful": 79.23, "error_recovery_stateful": 83.12, "data_gap_recovery_stateful": 148.06, "data_gap_recovery_extended_stateful": 167.06, "argument_transformation_stateful": 139.25, "grounded_synthesis_stateful": 463.7, "inconsistent_api_recovery_stateful": 373.2}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 11, 
"grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407.Q4_K_M LF/P [bare]", "model": "Mistral-Nemo-Instruct-2407.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 53.5, "accuracy": 60.6, "completeness": 88.3, "efficiency": 100.0, "wasted": 0.3, "speed": 3.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 92, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 56, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 54, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 86, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 28, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 27, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 35, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 280, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 220, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 182, 
"sequential_reasoning": 179, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 168, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 86, "sequential_reasoning_stateful": 170, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 132, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 2.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 165.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 4.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 13.0, "data_gap_recovery_extended_stateful": 4.0, "argument_transformation_stateful": 172.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 11.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 37.18, "argument_fidelity": 99.63, "tool_selection": 88.41, "basic_2step": 64.63, "sequential_3step": 96.22, "conditional_routing": 205.88, "sequential_reasoning": 145.82, "error_recovery": 0.0, "data_gap_recovery": 168.02, "data_gap_recovery_extended": 258.73, "argument_transformation": 365.86, "grounded_synthesis": 434.01, "inconsistent_api_recovery": 237.27, "relevance_detection_stateful": 37.86, "argument_fidelity_stateful": 99.84, "tool_selection_stateful": 82.77, "basic_2step_stateful": 68.97, "sequential_3step_stateful": 95.46, "conditional_routing_stateful": 221.47, "sequential_reasoning_stateful": 142.03, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 170.87, "data_gap_recovery_extended_stateful": 237.37, "argument_transformation_stateful": 384.68, "grounded_synthesis_stateful": 310.99, "inconsistent_api_recovery_stateful": 241.59}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, 
"argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q4_K_M LS/N [bare]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 53.8, "accuracy": 70.0, "completeness": 76.9, "efficiency": 96.0, "wasted": 0.2, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, 
"sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, 
"argument_fidelity": 40.96, "tool_selection": 33.65, "basic_2step": 22.09, "sequential_3step": 34.1, "conditional_routing": 263.22, "sequential_reasoning": 52.18, "error_recovery": 0.0, "data_gap_recovery": 128.78, "data_gap_recovery_extended": 166.76, "argument_transformation": 0.0, "grounded_synthesis": 158.9, "inconsistent_api_recovery": 127.59, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 41.76, "tool_selection_stateful": 33.85, "basic_2step_stateful": 24.56, "sequential_3step_stateful": 34.04, "conditional_routing_stateful": 93.54, "sequential_reasoning_stateful": 58.16, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 133.02, "data_gap_recovery_extended_stateful": 170.73, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 144.64, "inconsistent_api_recovery_stateful": 134.95}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q8_0 LF/P [reforged]", "model": "Meta-Llama-3.1-8B-Instruct.Q8_0", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 53.3, "accuracy": 58.9, "completeness": 90.5, "efficiency": 73.0, "wasted": 1.8, 
"speed": 5.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 92, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 10, "error_recovery": 32, "data_gap_recovery": 28, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 48, "inconsistent_api_recovery": 52, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 92, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 22, "data_gap_recovery_stateful": 32, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 46, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 5, "error_recovery": 16, "data_gap_recovery": 14, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 24, 
"inconsistent_api_recovery": 26, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 46, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 21}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 138, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 20, "error_recovery": 32, "data_gap_recovery": 70, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 240, "inconsistent_api_recovery": 208, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 138, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 33, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 200, "inconsistent_api_recovery_stateful": 168}, "scenarioActualCalls": {"relevance_detection": 52, "argument_fidelity": 148, "tool_selection": 191, "basic_2step": 100, "sequential_3step": 205, "conditional_routing": 231, "sequential_reasoning": 16, "error_recovery": 76, "data_gap_recovery": 64, "data_gap_recovery_extended": 34, "argument_transformation": 0, "grounded_synthesis": 393, "inconsistent_api_recovery": 366, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 148, "tool_selection_stateful": 191, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 245, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 55, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 343, "inconsistent_api_recovery_stateful": 302}, "scenarioWastedSum": {"relevance_detection": 2.0, "argument_fidelity": 10.0, "tool_selection": 43.0, "basic_2step": 0.0, "sequential_3step": 55.0, 
"conditional_routing": 74.0, "sequential_reasoning": 26.0, "error_recovery": 207.0, "data_gap_recovery": 30.0, "data_gap_recovery_extended": 40.0, "argument_transformation": 81.0, "grounded_synthesis": 292.0, "inconsistent_api_recovery": 291.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 11.0, "tool_selection_stateful": 41.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 10.0, "conditional_routing_stateful": 75.0, "sequential_reasoning_stateful": 37.0, "error_recovery_stateful": 113.0, "data_gap_recovery_stateful": 23.0, "data_gap_recovery_extended_stateful": 28.0, "argument_transformation_stateful": 66.0, "grounded_synthesis_stateful": 289.0, "inconsistent_api_recovery_stateful": 300.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 26.34, "argument_fidelity": 99.29, "tool_selection": 119.4, "basic_2step": 50.54, "sequential_3step": 135.04, "conditional_routing": 199.34, "sequential_reasoning": 128.63, "error_recovery": 133.24, "data_gap_recovery": 229.53, "data_gap_recovery_extended": 349.93, "argument_transformation": 331.24, "grounded_synthesis": 763.37, "inconsistent_api_recovery": 651.1, "relevance_detection_stateful": 
25.27, "argument_fidelity_stateful": 98.38, "tool_selection_stateful": 121.62, "basic_2step_stateful": 55.07, "sequential_3step_stateful": 9.64, "conditional_routing_stateful": 194.99, "sequential_reasoning_stateful": 143.1, "error_recovery_stateful": 118.16, "data_gap_recovery_stateful": 215.47, "data_gap_recovery_extended_stateful": 316.85, "argument_transformation_stateful": 192.9, "grounded_synthesis_stateful": 779.1, "inconsistent_api_recovery_stateful": 641.26}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/P [reforged]", "model": "Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 52.8, "accuracy": 59.5, "completeness": 88.6, "efficiency": 75.2, "wasted": 1.9, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 96, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 90, "sequential_reasoning": 10, "error_recovery": 28, "data_gap_recovery": 36, "data_gap_recovery_extended": 22, "argument_transformation": 0, "grounded_synthesis": 30, 
"inconsistent_api_recovery": 48, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 96, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 4, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 30, "data_gap_recovery_stateful": 24, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 30}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 48, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 45, "sequential_reasoning": 5, "error_recovery": 14, "data_gap_recovery": 18, "data_gap_recovery_extended": 11, "argument_transformation": 0, "grounded_synthesis": 15, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 48, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 2, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 15, "data_gap_recovery_stateful": 12, 
"data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 15}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 144, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 180, 
"sequential_reasoning": 20, "error_recovery": 28, "data_gap_recovery": 90, "data_gap_recovery_extended": 88, "argument_transformation": 0, "grounded_synthesis": 150, "inconsistent_api_recovery": 192, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 144, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 6, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 45, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 120}, "scenarioActualCalls": {"relevance_detection": 66, "argument_fidelity": 149, "tool_selection": 130, "basic_2step": 100, "sequential_3step": 199, "conditional_routing": 257, "sequential_reasoning": 17, "error_recovery": 112, "data_gap_recovery": 84, "data_gap_recovery_extended": 67, "argument_transformation": 0, "grounded_synthesis": 268, "inconsistent_api_recovery": 349, "relevance_detection_stateful": 67, "argument_fidelity_stateful": 151, "tool_selection_stateful": 164, "basic_2step_stateful": 102, "sequential_3step_stateful": 6, "conditional_routing_stateful": 240, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 63, "data_gap_recovery_extended_stateful": 82, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 175, "inconsistent_api_recovery_stateful": 218}, "scenarioWastedSum": {"relevance_detection": 16.0, "argument_fidelity": 5.0, "tool_selection": 9.0, "basic_2step": 0.0, "sequential_3step": 49.0, "conditional_routing": 85.0, "sequential_reasoning": 38.0, "error_recovery": 266.0, "data_gap_recovery": 40.0, "data_gap_recovery_extended": 13.0, "argument_transformation": 13.0, "grounded_synthesis": 302.0, "inconsistent_api_recovery": 307.0, "relevance_detection_stateful": 17.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 20.0, 
"basic_2step_stateful": 2.0, "sequential_3step_stateful": 47.0, "conditional_routing_stateful": 92.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 129.0, "data_gap_recovery_stateful": 42.0, "data_gap_recovery_extended_stateful": 22.0, "argument_transformation_stateful": 19.0, "grounded_synthesis_stateful": 294.0, "inconsistent_api_recovery_stateful": 334.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 12.42, "argument_fidelity": 33.08, "tool_selection": 33.26, "basic_2step": 18.04, "sequential_3step": 45.95, "conditional_routing": 82.89, "sequential_reasoning": 49.08, "error_recovery": 53.31, "data_gap_recovery": 99.74, "data_gap_recovery_extended": 125.28, "argument_transformation": 40.49, "grounded_synthesis": 287.87, "inconsistent_api_recovery": 246.95, "relevance_detection_stateful": 12.5, "argument_fidelity_stateful": 34.3, "tool_selection_stateful": 45.23, "basic_2step_stateful": 20.4, "sequential_3step_stateful": 12.04, "conditional_routing_stateful": 89.67, "sequential_reasoning_stateful": 53.14, "error_recovery_stateful": 42.68, "data_gap_recovery_stateful": 94.45, "data_gap_recovery_extended_stateful": 137.76, 
"argument_transformation_stateful": 25.31, "grounded_synthesis_stateful": 284.96, "inconsistent_api_recovery_stateful": 248.82}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/P [bare]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 53.5, "accuracy": 62.7, "completeness": 85.3, "efficiency": 95.8, "wasted": 0.2, "speed": 22.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 18, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 92, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 16, "basic_2step_stateful": 84, "sequential_3step_stateful": 100, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 62, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 2, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 6}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 9, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 46, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 30, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 17, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 8, "basic_2step_stateful": 42, "sequential_3step_stateful": 50, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 31, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 3}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 9, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, 
"data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 8, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 9, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 8, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 27, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 184, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 170, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 24, "basic_2step_stateful": 84, "sequential_3step_stateful": 150, "conditional_routing_stateful": 112, 
"sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 24}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 27, "basic_2step": 91, "sequential_3step": 150, "conditional_routing": 190, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 144, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 218, "inconsistent_api_recovery": 217, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 24, "basic_2step_stateful": 84, "sequential_3step_stateful": 150, "conditional_routing_stateful": 138, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 148, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 70, "inconsistent_api_recovery_stateful": 34}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 15.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 75.0, "inconsistent_api_recovery": 26.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 26.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 70.0, "inconsistent_api_recovery_stateful": 34.0}, "scenarioWastedN": {"relevance_detection": 
50, "argument_fidelity": 50, "tool_selection": 9, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 8, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 76.67, "argument_fidelity": 302.4, "tool_selection": 56.04, "basic_2step": 133.31, "sequential_3step": 529.23, "conditional_routing": 920.8, "sequential_reasoning": 472.7, "error_recovery": 0.0, "data_gap_recovery": 822.53, "data_gap_recovery_extended": 1213.16, "argument_transformation": 2357.75, "grounded_synthesis": 3539.74, "inconsistent_api_recovery": 1895.27, "relevance_detection_stateful": 77.93, "argument_fidelity_stateful": 306.95, "tool_selection_stateful": 52.05, "basic_2step_stateful": 137.52, "sequential_3step_stateful": 473.5, "conditional_routing_stateful": 1065.04, "sequential_reasoning_stateful": 467.02, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 820.12, "data_gap_recovery_extended_stateful": 1238.33, "argument_transformation_stateful": 2386.99, "grounded_synthesis_stateful": 3356.77, "inconsistent_api_recovery_stateful": 2201.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 9, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 
50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 8, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/P [bare]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 53.1, "accuracy": 66.0, "completeness": 80.5, "efficiency": 99.8, "wasted": 0.5, "speed": 2.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 78, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 100, "sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 94, "data_gap_recovery_extended": 76, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 82, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 34, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 31, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 38, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 17, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 117, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 200, "sequential_reasoning": 124, "error_recovery": 0, "data_gap_recovery": 235, "data_gap_recovery_extended": 304, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 123, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 256, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 159, "tool_selection": 117, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 249, 
"sequential_reasoning": 316, "error_recovery": 0, "data_gap_recovery": 187, "data_gap_recovery_extended": 133, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 159, "tool_selection_stateful": 123, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 166, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 184, "data_gap_recovery_extended_stateful": 114, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 9.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 49.0, "sequential_reasoning": 221.0, "error_recovery": 0.0, "data_gap_recovery": 8.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 19.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 9.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 164.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 16.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 11.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 18.45, "argument_fidelity": 48.01, "tool_selection": 47.61, "basic_2step": 30.5, "sequential_3step": 0.0, "conditional_routing": 131.75, "sequential_reasoning": 119.16, "error_recovery": 0.0, "data_gap_recovery": 157.53, "data_gap_recovery_extended": 198.56, "argument_transformation": 225.07, "grounded_synthesis": 189.47, "inconsistent_api_recovery": 174.53, "relevance_detection_stateful": 18.76, "argument_fidelity_stateful": 48.29, "tool_selection_stateful": 51.13, "basic_2step_stateful": 33.46, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 128.13, "sequential_reasoning_stateful": 90.94, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 156.64, "data_gap_recovery_extended_stateful": 192.9, "argument_transformation_stateful": 223.09, "grounded_synthesis_stateful": 194.53, "inconsistent_api_recovery_stateful": 183.68}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [bare]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 52.8, "accuracy": 88.4, "completeness": 59.8, "efficiency": 100.0, "wasted": 0.1, "speed": 11.8, "n": 50, "scenarios": {"relevance_detection": 14, "argument_fidelity": 72, "tool_selection": 2, "basic_2step": 92, "sequential_3step": 68, "conditional_routing": 92, "sequential_reasoning": 42, "error_recovery": 0, "data_gap_recovery": 98, "data_gap_recovery_extended": 58, "argument_transformation": 66, "grounded_synthesis": 52, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 80, "tool_selection_stateful": 4, "basic_2step_stateful": 86, "sequential_3step_stateful": 56, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 70, "argument_transformation_stateful": 56, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 21, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 29, "argument_transformation": 33, "grounded_synthesis": 26, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 23, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}, "scenarioValidated": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, 
"data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}, "scenarioIdealCalls": {"relevance_detection": 7, "argument_fidelity": 108, "tool_selection": 3, "basic_2step": 92, "sequential_3step": 102, "conditional_routing": 184, "sequential_reasoning": 84, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 232, "argument_transformation": 165, "grounded_synthesis": 260, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 120, "tool_selection_stateful": 6, "basic_2step_stateful": 86, "sequential_3step_stateful": 84, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 80, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 140, "grounded_synthesis_stateful": 230, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 7, "argument_fidelity": 108, "tool_selection": 3, "basic_2step": 92, "sequential_3step": 102, "conditional_routing": 134, "sequential_reasoning": 84, "error_recovery": 0, "data_gap_recovery": 169, "data_gap_recovery_extended": 123, "argument_transformation": 132, "grounded_synthesis": 196, "inconsistent_api_recovery": 240, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 120, "tool_selection_stateful": 6, "basic_2step_stateful": 86, "sequential_3step_stateful": 84, "conditional_routing_stateful": 121, 
"sequential_reasoning_stateful": 80, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 151, "data_gap_recovery_extended_stateful": 132, "argument_transformation_stateful": 117, "grounded_synthesis_stateful": 142, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 9.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 49.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 6.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 32.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}, "scenarioSpeedSum": {"relevance_detection": 47.07, 
"argument_fidelity": 126.97, "tool_selection": 2.92, "basic_2step": 139.07, "sequential_3step": 142.39, "conditional_routing": 443.97, "sequential_reasoning": 148.26, "error_recovery": 0.0, "data_gap_recovery": 490.41, "data_gap_recovery_extended": 480.18, "argument_transformation": 1264.45, "grounded_synthesis": 807.42, "inconsistent_api_recovery": 725.27, "relevance_detection_stateful": 32.83, "argument_fidelity_stateful": 127.91, "tool_selection_stateful": 5.59, "basic_2step_stateful": 122.27, "sequential_3step_stateful": 127.44, "conditional_routing_stateful": 446.13, "sequential_reasoning_stateful": 108.78, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 438.8, "data_gap_recovery_extended_stateful": 483.18, "argument_transformation_stateful": 1081.75, "grounded_synthesis_stateful": 723.53, "inconsistent_api_recovery_stateful": 681.41}, "scenarioSpeedN": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/N [reforged]", "model": "Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 51.8, "accuracy": 52.8, "completeness": 98.0, "efficiency": 
75.1, "wasted": 1.6, "speed": 1.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 92, "basic_2step": 100, "sequential_3step": 80, "conditional_routing": 72, "sequential_reasoning": 66, "error_recovery": 76, "data_gap_recovery": 20, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 98, "sequential_3step_stateful": 4, "conditional_routing_stateful": 72, "sequential_reasoning_stateful": 54, "error_recovery_stateful": 80, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 40, "conditional_routing": 36, "sequential_reasoning": 33, "error_recovery": 38, "data_gap_recovery": 10, "data_gap_recovery_extended": 1, "argument_transformation": 0, 
"grounded_synthesis": 1, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 49, "sequential_3step_stateful": 2, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 
50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 138, "basic_2step": 100, "sequential_3step": 120, "conditional_routing": 144, "sequential_reasoning": 132, "error_recovery": 76, "data_gap_recovery": 50, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 98, "sequential_3step_stateful": 6, "conditional_routing_stateful": 144, "sequential_reasoning_stateful": 108, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 151, "tool_selection": 193, "basic_2step": 100, "sequential_3step": 298, "conditional_routing": 191, "sequential_reasoning": 174, "error_recovery": 114, "data_gap_recovery": 60, "data_gap_recovery_extended": 11, "argument_transformation": 0, "grounded_synthesis": 15, "inconsistent_api_recovery": 131, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 219, "basic_2step_stateful": 98, "sequential_3step_stateful": 17, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 135, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 17, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 32}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 71.0, "basic_2step": 0.0, "sequential_3step": 235.0, 
"conditional_routing": 56.0, "sequential_reasoning": 86.0, "error_recovery": 50.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 32.0, "argument_transformation": 33.0, "grounded_synthesis": 177.0, "inconsistent_api_recovery": 263.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 86.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 171.0, "conditional_routing_stateful": 64.0, "sequential_reasoning_stateful": 63.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 14.0, "data_gap_recovery_extended_stateful": 46.0, "argument_transformation_stateful": 78.0, "grounded_synthesis_stateful": 193.0, "inconsistent_api_recovery_stateful": 296.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 10.73, "argument_fidelity": 43.08, "tool_selection": 48.31, "basic_2step": 18.13, "sequential_3step": 84.39, "conditional_routing": 95.7, "sequential_reasoning": 74.93, "error_recovery": 27.56, "data_gap_recovery": 67.6, "data_gap_recovery_extended": 88.58, "argument_transformation": 65.29, "grounded_synthesis": 189.11, "inconsistent_api_recovery": 223.29, "relevance_detection_stateful": 11.09, 
"argument_fidelity_stateful": 42.41, "tool_selection_stateful": 58.0, "basic_2step_stateful": 20.61, "sequential_3step_stateful": 62.64, "conditional_routing_stateful": 102.69, "sequential_reasoning_stateful": 103.33, "error_recovery_stateful": 27.64, "data_gap_recovery_stateful": 62.42, "data_gap_recovery_extended_stateful": 98.9, "argument_transformation_stateful": 83.69, "grounded_synthesis_stateful": 188.03, "inconsistent_api_recovery_stateful": 248.23}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q4_K_M LF/P [reforged]", "model": "Meta-Llama-3.1-8B-Instruct.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 51.2, "accuracy": 57.2, "completeness": 89.5, "efficiency": 76.3, "wasted": 1.8, "speed": 3.5, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 10, "error_recovery": 24, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 32, "inconsistent_api_recovery": 
56, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 98, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 26, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 40}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 5, "error_recovery": 12, "data_gap_recovery": 9, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 16, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 13, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 4, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 20}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 20, "error_recovery": 24, 
"data_gap_recovery": 45, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 160, "inconsistent_api_recovery": 224, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 39, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 160}, "scenarioActualCalls": {"relevance_detection": 52, "argument_fidelity": 150, "tool_selection": 168, "basic_2step": 100, "sequential_3step": 176, "conditional_routing": 208, "sequential_reasoning": 25, "error_recovery": 52, "data_gap_recovery": 63, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 274, "inconsistent_api_recovery": 406, "relevance_detection_stateful": 54, "argument_fidelity_stateful": 150, "tool_selection_stateful": 171, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 244, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 56, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 128, "inconsistent_api_recovery_stateful": 288}, "scenarioWastedSum": {"relevance_detection": 3.0, "argument_fidelity": 0.0, "tool_selection": 24.0, "basic_2step": 0.0, "sequential_3step": 26.0, "conditional_routing": 57.0, "sequential_reasoning": 41.0, "error_recovery": 118.0, "data_gap_recovery": 69.0, "data_gap_recovery_extended": 43.0, "argument_transformation": 102.0, "grounded_synthesis": 238.0, "inconsistent_api_recovery": 321.0, "relevance_detection_stateful": 5.0, "argument_fidelity_stateful": 3.0, "tool_selection_stateful": 21.0, "basic_2step_stateful": 0.0, 
"sequential_3step_stateful": 90.0, "conditional_routing_stateful": 60.0, "sequential_reasoning_stateful": 67.0, "error_recovery_stateful": 73.0, "data_gap_recovery_stateful": 41.0, "data_gap_recovery_extended_stateful": 45.0, "argument_transformation_stateful": 42.0, "grounded_synthesis_stateful": 284.0, "inconsistent_api_recovery_stateful": 299.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 21.52, "argument_fidelity": 64.5, "tool_selection": 75.96, "basic_2step": 38.86, "sequential_3step": 82.83, "conditional_routing": 135.54, "sequential_reasoning": 94.03, "error_recovery": 75.85, "data_gap_recovery": 160.55, "data_gap_recovery_extended": 202.33, "argument_transformation": 191.79, "grounded_synthesis": 426.77, "inconsistent_api_recovery": 474.59, "relevance_detection_stateful": 22.48, "argument_fidelity_stateful": 68.23, "tool_selection_stateful": 81.61, "basic_2step_stateful": 43.49, "sequential_3step_stateful": 46.1, "conditional_routing_stateful": 139.46, "sequential_reasoning_stateful": 103.96, "error_recovery_stateful": 76.47, "data_gap_recovery_stateful": 128.97, "data_gap_recovery_extended_stateful": 202.33, "argument_transformation_stateful": 
110.03, "grounded_synthesis_stateful": 499.22, "inconsistent_api_recovery_stateful": 474.31}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/P [bare]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 50.2, "accuracy": 82.2, "completeness": 61.0, "efficiency": 100.0, "wasted": 0.2, "speed": 2.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 44}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 22}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 45, 
"data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 12, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 176}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 135, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 285, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 12, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 132, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 132}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 25.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 28.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 
0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.71, "argument_fidelity": 65.23, "tool_selection": 0.0, "basic_2step": 29.45, "sequential_3step": 89.63, "conditional_routing": 156.04, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 173.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 254.03, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 239.03, "relevance_detection_stateful": 16.72, "argument_fidelity_stateful": 64.58, "tool_selection_stateful": 0.0, "basic_2step_stateful": 32.61, "sequential_3step_stateful": 88.7, "conditional_routing_stateful": 153.16, "sequential_reasoning_stateful": 5.71, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 169.29, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 282.97, "grounded_synthesis_stateful": 12.04, "inconsistent_api_recovery_stateful": 238.35}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/N [reforged]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 48.8, "accuracy": 51.8, "completeness": 94.2, "efficiency": 71.2, "wasted": 1.8, "speed": 2.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 86, "sequential_3step": 92, "conditional_routing": 64, "sequential_reasoning": 68, "error_recovery": 38, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 92, "sequential_3step_stateful": 16, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 36, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 14, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 38}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 43, "sequential_3step": 46, "conditional_routing": 32, "sequential_reasoning": 34, "error_recovery": 19, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 15, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 46, "sequential_3step_stateful": 8, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 21, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 19}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 29, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 86, "sequential_3step": 138, "conditional_routing": 128, "sequential_reasoning": 136, "error_recovery": 38, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 120, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 92, "sequential_3step_stateful": 24, "conditional_routing_stateful": 104, "sequential_reasoning_stateful": 72, "error_recovery_stateful": 63, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 152}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 151, "tool_selection": 185, "basic_2step": 90, "sequential_3step": 326, "conditional_routing": 170, "sequential_reasoning": 216, "error_recovery": 58, 
"data_gap_recovery": 8, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 245, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 152, "tool_selection_stateful": 159, "basic_2step_stateful": 95, "sequential_3step_stateful": 47, "conditional_routing_stateful": 142, "sequential_reasoning_stateful": 106, "error_recovery_stateful": 70, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 305}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 35.0, "basic_2step": 11.0, "sequential_3step": 197.0, "conditional_routing": 55.0, "sequential_reasoning": 89.0, "error_recovery": 54.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 17.0, "argument_transformation": 188.0, "grounded_synthesis": 155.0, "inconsistent_api_recovery": 288.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 30.0, "basic_2step_stateful": 7.0, "sequential_3step_stateful": 239.0, "conditional_routing_stateful": 56.0, "sequential_reasoning_stateful": 52.0, "error_recovery_stateful": 10.0, "data_gap_recovery_stateful": 22.0, "data_gap_recovery_extended_stateful": 20.0, "argument_transformation_stateful": 156.0, "grounded_synthesis_stateful": 192.0, "inconsistent_api_recovery_stateful": 302.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioSpeedSum": {"relevance_detection": 15.33, "argument_fidelity": 64.2, "tool_selection": 67.45, "basic_2step": 30.5, "sequential_3step": 121.19, "conditional_routing": 128.53, "sequential_reasoning": 122.07, "error_recovery": 42.98, "data_gap_recovery": 101.41, "data_gap_recovery_extended": 118.43, "argument_transformation": 139.21, "grounded_synthesis": 261.56, "inconsistent_api_recovery": 318.68, "relevance_detection_stateful": 16.02, "argument_fidelity_stateful": 62.72, "tool_selection_stateful": 69.15, "basic_2step_stateful": 31.94, "sequential_3step_stateful": 110.39, "conditional_routing_stateful": 119.38, "sequential_reasoning_stateful": 97.4, "error_recovery_stateful": 43.83, "data_gap_recovery_stateful": 111.6, "data_gap_recovery_extended_stateful": 122.69, "argument_transformation_stateful": 123.85, "grounded_synthesis_stateful": 272.46, "inconsistent_api_recovery_stateful": 326.8}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 29, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}}, {"label": "qwen3:8b-q8_0 OL/N [bare]", "model": "qwen3:8b-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 47.3, "accuracy": 56.8, "completeness": 83.3, "efficiency": 95.8, "wasted": 0.1, "speed": 24.0, "n": 50, "scenarios": {"relevance_detection": 86, "argument_fidelity": 100, "tool_selection": 2, "basic_2step": 34, "sequential_3step": 92, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 64, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 6, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 98, "tool_selection_stateful": 2, "basic_2step_stateful": 68, "sequential_3step_stateful": 92, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 3, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 34, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 43, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 34, "sequential_3step": 138, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 160, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 30, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 147, "tool_selection_stateful": 3, "basic_2step_stateful": 68, "sequential_3step_stateful": 138, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 43, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 34, "sequential_3step": 138, "conditional_routing": 243, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 153, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 23, "inconsistent_api_recovery": 76, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 147, "tool_selection_stateful": 3, "basic_2step_stateful": 68, "sequential_3step_stateful": 138, "conditional_routing_stateful": 245, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 22}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 45.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 14.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 49.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 19.0}, "scenarioWastedN": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 134.13, "argument_fidelity": 379.45, "tool_selection": 8.78, "basic_2step": 73.93, 
"sequential_3step": 636.21, "conditional_routing": 1430.82, "sequential_reasoning": 512.68, "error_recovery": 175.78, "data_gap_recovery": 769.24, "data_gap_recovery_extended": 1745.51, "argument_transformation": 2307.53, "grounded_synthesis": 2670.9, "inconsistent_api_recovery": 1783.53, "relevance_detection_stateful": 135.46, "argument_fidelity_stateful": 407.11, "tool_selection_stateful": 9.18, "basic_2step_stateful": 195.02, "sequential_3step_stateful": 723.4, "conditional_routing_stateful": 1571.28, "sequential_reasoning_stateful": 552.39, "error_recovery_stateful": 144.63, "data_gap_recovery_stateful": 903.44, "data_gap_recovery_extended_stateful": 1803.24, "argument_transformation_stateful": 2614.36, "grounded_synthesis_stateful": 2433.41, "inconsistent_api_recovery_stateful": 1860.26}, "scenarioSpeedN": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/N [bare]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 46.6, "accuracy": 64.0, "completeness": 72.8, "efficiency": 100.0, "wasted": 0.1, "speed": 20.5, "n": 50, "scenarios": {"relevance_detection": 80, "argument_fidelity": 
76, "tool_selection": 0, "basic_2step": 86, "sequential_3step": 94, "conditional_routing": 94, "sequential_reasoning": 82, "error_recovery": 0, "data_gap_recovery": 74, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 16, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 70, "tool_selection_stateful": 0, "basic_2step_stateful": 88, "sequential_3step_stateful": 84, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 76, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 66, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 40, "argument_fidelity": 38, "tool_selection": 0, "basic_2step": 43, "sequential_3step": 47, "conditional_routing": 47, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 8, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, 
"tool_selection_stateful": 0, "basic_2step_stateful": 44, "sequential_3step_stateful": 42, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 33, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 40, "argument_fidelity": 38, "tool_selection": 0, "basic_2step": 43, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 31, "grounded_synthesis": 39, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 0, "basic_2step_stateful": 44, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 27, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 40, "argument_fidelity": 38, "tool_selection": 0, "basic_2step": 43, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 31, "grounded_synthesis": 39, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 0, "basic_2step_stateful": 44, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 27, "grounded_synthesis_stateful": 41, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 40, "argument_fidelity": 114, "tool_selection": 0, "basic_2step": 86, "sequential_3step": 141, "conditional_routing": 188, "sequential_reasoning": 164, "error_recovery": 0, "data_gap_recovery": 185, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 80, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 105, "tool_selection_stateful": 0, "basic_2step_stateful": 88, "sequential_3step_stateful": 126, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 152, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 165, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 90, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 40, "argument_fidelity": 114, "tool_selection": 0, "basic_2step": 86, "sequential_3step": 141, "conditional_routing": 232, "sequential_reasoning": 164, "error_recovery": 0, "data_gap_recovery": 161, "data_gap_recovery_extended": 0, "argument_transformation": 38, "grounded_synthesis": 46, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 105, "tool_selection_stateful": 0, "basic_2step_stateful": 88, "sequential_3step_stateful": 126, "conditional_routing_stateful": 235, "sequential_reasoning_stateful": 152, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 148, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 53, "inconsistent_api_recovery_stateful": 9}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 45.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 9.0, "data_gap_recovery_extended": 0.0, 
"argument_transformation": 11.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 47.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 40, "argument_fidelity": 38, "tool_selection": 0, "basic_2step": 43, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 31, "grounded_synthesis": 39, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 0, "basic_2step_stateful": 44, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 27, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 117.66, "argument_fidelity": 354.25, "tool_selection": 0.0, "basic_2step": 200.64, "sequential_3step": 698.56, "conditional_routing": 1286.29, "sequential_reasoning": 644.1, "error_recovery": 0.0, "data_gap_recovery": 979.83, "data_gap_recovery_extended": 1262.57, "argument_transformation": 1441.46, "grounded_synthesis": 1637.23, "inconsistent_api_recovery": 1125.96, "relevance_detection_stateful": 142.91, "argument_fidelity_stateful": 327.18, "tool_selection_stateful": 0.0, "basic_2step_stateful": 232.21, "sequential_3step_stateful": 663.04, 
"conditional_routing_stateful": 1333.82, "sequential_reasoning_stateful": 643.69, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1053.67, "data_gap_recovery_extended_stateful": 1151.1, "argument_transformation_stateful": 1276.29, "grounded_synthesis_stateful": 1732.97, "inconsistent_api_recovery_stateful": 1119.34}, "scenarioSpeedN": {"relevance_detection": 40, "argument_fidelity": 38, "tool_selection": 0, "basic_2step": 43, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 31, "grounded_synthesis": 39, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 0, "basic_2step_stateful": 44, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 27, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-haiku-4-5-20251001 AN/N [bare]", "model": "claude-haiku-4-5-20251001", "backend": "anthropic", "mode": "native", "ablation": "bare", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 46.5, "accuracy": 86.2, "completeness": 54.0, "efficiency": 100.0, "wasted": 0.0, "speed": 8.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 96, "tool_selection": 100, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 4, "argument_transformation": 74, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 96, "tool_selection_stateful": 96, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, 
"conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 36, "grounded_synthesis_stateful": 98, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 48, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 48, "tool_selection_stateful": 48, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 18, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 48, 
"tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 48, "tool_selection_stateful": 48, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 48, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 48, "tool_selection_stateful": 48, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 144, "tool_selection": 150, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 16, "argument_transformation": 185, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 144, 
"tool_selection_stateful": 144, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 90, "grounded_synthesis_stateful": 490, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 144, "tool_selection": 150, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 8, "argument_transformation": 119, "grounded_synthesis": 161, "inconsistent_api_recovery": 261, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 144, "tool_selection_stateful": 144, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 60, "grounded_synthesis_stateful": 165, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, 
"grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 48, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 48, "tool_selection_stateful": 48, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 218.07, "tool_selection": 239.57, "basic_2step": 0.0, "sequential_3step": 161.42, "conditional_routing": 365.53, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 10.48, "data_gap_recovery_extended": 38.69, "argument_transformation": 516.23, "grounded_synthesis": 728.04, "inconsistent_api_recovery": 478.11, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 146.54, "tool_selection_stateful": 138.87, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 224.29, "conditional_routing_stateful": 451.53, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 38.46, "argument_transformation_stateful": 547.11, "grounded_synthesis_stateful": 848.56, "inconsistent_api_recovery_stateful": 594.99}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 48, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, 
"data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 48, "tool_selection_stateful": 48, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-27B-Q4_K_M LS/N [bare]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 47.0, "accuracy": 92.4, "completeness": 50.8, "efficiency": 100.0, "wasted": 0.0, "speed": 26.3, "n": 50, "scenarios": {"relevance_detection": 52, "argument_fidelity": 88, "tool_selection": 82, "basic_2step": 68, "sequential_3step": 28, "conditional_routing": 42, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 72, "data_gap_recovery_extended": 26, "argument_transformation": 22, "grounded_synthesis": 62, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 62, "argument_fidelity_stateful": 88, "tool_selection_stateful": 80, "basic_2step_stateful": 66, "sequential_3step_stateful": 40, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 54, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 62, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 13, "argument_transformation": 11, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 
27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}, "scenarioValidated": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}, "scenarioIdealCalls": {"relevance_detection": 26, "argument_fidelity": 132, "tool_selection": 123, "basic_2step": 68, "sequential_3step": 42, "conditional_routing": 84, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 180, "data_gap_recovery_extended": 104, "argument_transformation": 55, "grounded_synthesis": 310, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 132, "tool_selection_stateful": 120, "basic_2step_stateful": 66, "sequential_3step_stateful": 60, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 108, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 96, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 310, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 26, "argument_fidelity": 132, "tool_selection": 123, "basic_2step": 
68, "sequential_3step": 42, "conditional_routing": 48, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 115, "data_gap_recovery_extended": 47, "argument_transformation": 38, "grounded_synthesis": 104, "inconsistent_api_recovery": 57, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 132, "tool_selection_stateful": 120, "basic_2step_stateful": 66, "sequential_3step_stateful": 60, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 108, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 134, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 105, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 1.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 
44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}, "scenarioSpeedSum": {"relevance_detection": 835.3, "argument_fidelity": 476.07, "tool_selection": 304.38, "basic_2step": 217.69, "sequential_3step": 224.89, "conditional_routing": 576.75, "sequential_reasoning": 463.44, "error_recovery": 0.0, "data_gap_recovery": 1217.85, "data_gap_recovery_extended": 934.07, "argument_transformation": 993.66, "grounded_synthesis": 1928.32, "inconsistent_api_recovery": 571.12, "relevance_detection_stateful": 959.43, "argument_fidelity_stateful": 470.08, "tool_selection_stateful": 280.4, "basic_2step_stateful": 216.82, "sequential_3step_stateful": 261.62, "conditional_routing_stateful": 566.35, "sequential_reasoning_stateful": 492.5, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1396.69, "data_gap_recovery_extended_stateful": 880.34, "argument_transformation_stateful": 648.32, "grounded_synthesis_stateful": 1870.55, "inconsistent_api_recovery_stateful": 568.7}, "scenarioSpeedN": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, 
"data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}}, {"label": "Mistral-7B-Instruct-v0.3.Q8_0 LF/P [reforged]", "model": "Mistral-7B-Instruct-v0.3.Q8_0", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 45.7, "accuracy": 47.0, "completeness": 97.2, "efficiency": 89.6, "wasted": 0.7, "speed": 8.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 42, "sequential_reasoning": 64, "error_recovery": 4, "data_gap_recovery": 30, "data_gap_recovery_extended": 6, "argument_transformation": 0, "grounded_synthesis": 12, "inconsistent_api_recovery": 20, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 12, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 12}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 21, "sequential_reasoning": 32, "error_recovery": 2, "data_gap_recovery": 15, "data_gap_recovery_extended": 3, "argument_transformation": 0, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 2, "data_gap_recovery_stateful": 13, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 6}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 84, "sequential_reasoning": 128, "error_recovery": 4, "data_gap_recovery": 75, "data_gap_recovery_extended": 24, "argument_transformation": 0, "grounded_synthesis": 60, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 24, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 65, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 200, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 224, "conditional_routing": 54, "sequential_reasoning": 82, "error_recovery": 8, "data_gap_recovery": 83, "data_gap_recovery_extended": 25, "argument_transformation": 0, "grounded_synthesis": 35, "inconsistent_api_recovery": 117, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 196, "tool_selection_stateful": 0, "basic_2step_stateful": 150, "sequential_3step_stateful": 197, 
"conditional_routing_stateful": 70, "sequential_reasoning_stateful": 29, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 73, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 72}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 50.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 76.0, "conditional_routing": 36.0, "sequential_reasoning": 10.0, "error_recovery": 103.0, "data_gap_recovery": 31.0, "data_gap_recovery_extended": 29.0, "argument_transformation": 16.0, "grounded_synthesis": 22.0, "inconsistent_api_recovery": 72.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 49.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 54.0, "conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 12.0, "error_recovery_stateful": 54.0, "data_gap_recovery_stateful": 29.0, "data_gap_recovery_extended_stateful": 12.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 48.0, "inconsistent_api_recovery_stateful": 67.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, 
"scenarioSpeedSum": {"relevance_detection": 42.78, "argument_fidelity": 172.23, "tool_selection": 75.95, "basic_2step": 90.5, "sequential_3step": 179.35, "conditional_routing": 339.16, "sequential_reasoning": 180.03, "error_recovery": 108.96, "data_gap_recovery": 433.06, "data_gap_recovery_extended": 707.71, "argument_transformation": 948.56, "grounded_synthesis": 1134.66, "inconsistent_api_recovery": 849.08, "relevance_detection_stateful": 39.5, "argument_fidelity_stateful": 167.96, "tool_selection_stateful": 76.46, "basic_2step_stateful": 104.79, "sequential_3step_stateful": 160.15, "conditional_routing_stateful": 339.01, "sequential_reasoning_stateful": 158.07, "error_recovery_stateful": 112.23, "data_gap_recovery_stateful": 413.18, "data_gap_recovery_extended_stateful": 618.05, "argument_transformation_stateful": 1017.89, "grounded_synthesis_stateful": 1135.63, "inconsistent_api_recovery_stateful": 805.87}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/P [reforged]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 
46.1, "accuracy": 47.8, "completeness": 96.5, "efficiency": 89.9, "wasted": 0.7, "speed": 4.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 98, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 42, "sequential_reasoning": 76, "error_recovery": 14, "data_gap_recovery": 28, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 16, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 86, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 10}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 21, "sequential_reasoning": 38, "error_recovery": 7, "data_gap_recovery": 14, 
"data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 19, "sequential_reasoning_stateful": 9, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 5}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 46, "argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 46, "argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 84, "sequential_reasoning": 152, "error_recovery": 14, "data_gap_recovery": 70, "data_gap_recovery_extended": 16, "argument_transformation": 0, "grounded_synthesis": 80, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 129, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 36, "error_recovery_stateful": 18, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 40}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 196, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 218, "conditional_routing": 54, "sequential_reasoning": 104, "error_recovery": 28, "data_gap_recovery": 70, "data_gap_recovery_extended": 7, "argument_transformation": 0, "grounded_synthesis": 65, "inconsistent_api_recovery": 122, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 200, "tool_selection_stateful": 0, "basic_2step_stateful": 149, "sequential_3step_stateful": 182, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 42, "error_recovery_stateful": 24, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 64}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 50.0, "tool_selection": 0.0, 
"basic_2step": 50.0, "sequential_3step": 72.0, "conditional_routing": 35.0, "sequential_reasoning": 12.0, "error_recovery": 101.0, "data_gap_recovery": 32.0, "data_gap_recovery_extended": 24.0, "argument_transformation": 19.0, "grounded_synthesis": 31.0, "inconsistent_api_recovery": 90.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 49.0, "sequential_3step_stateful": 53.0, "conditional_routing_stateful": 42.0, "sequential_reasoning_stateful": 8.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 37.0, "data_gap_recovery_extended_stateful": 18.0, "argument_transformation_stateful": 6.0, "grounded_synthesis_stateful": 21.0, "inconsistent_api_recovery_stateful": 78.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 46, "argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 22.96, "argument_fidelity": 97.21, "tool_selection": 41.66, "basic_2step": 49.08, "sequential_3step": 100.63, "conditional_routing": 190.9, "sequential_reasoning": 109.43, "error_recovery": 59.44, "data_gap_recovery": 275.79, "data_gap_recovery_extended": 377.75, "argument_transformation": 519.84, "grounded_synthesis": 653.94, "inconsistent_api_recovery": 
492.76, "relevance_detection_stateful": 23.11, "argument_fidelity_stateful": 97.91, "tool_selection_stateful": 40.37, "basic_2step_stateful": 57.73, "sequential_3step_stateful": 80.09, "conditional_routing_stateful": 198.39, "sequential_reasoning_stateful": 90.01, "error_recovery_stateful": 58.18, "data_gap_recovery_stateful": 229.13, "data_gap_recovery_extended_stateful": 332.53, "argument_transformation_stateful": 537.36, "grounded_synthesis_stateful": 684.16, "inconsistent_api_recovery_stateful": 495.15}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 46, "argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}}, {"label": "granite-4.1-8b-Q4_K_M LS/P [bare]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 46.2, "accuracy": 50.0, "completeness": 92.3, "efficiency": 100.0, "wasted": 0.0, "speed": 2.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 100, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 
200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 250, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 250, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 150, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, 
"conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 54.56, "argument_fidelity": 40.06, "tool_selection": 32.15, "basic_2step": 20.04, "sequential_3step": 40.27, "conditional_routing": 106.51, "sequential_reasoning": 53.06, "error_recovery": 0.0, "data_gap_recovery": 37.05, "data_gap_recovery_extended": 105.71, "argument_transformation": 470.74, "grounded_synthesis": 146.46, "inconsistent_api_recovery": 73.07, "relevance_detection_stateful": 57.01, "argument_fidelity_stateful": 40.06, "tool_selection_stateful": 35.73, "basic_2step_stateful": 22.04, "sequential_3step_stateful": 40.06, "conditional_routing_stateful": 101.17, "sequential_reasoning_stateful": 52.69, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 38.01, "data_gap_recovery_extended_stateful": 105.65, "argument_transformation_stateful": 470.51, "grounded_synthesis_stateful": 161.81, 
"inconsistent_api_recovery_stateful": 77.52}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite4.1:8b-q8_0 OL/N [bare]", "model": "granite4.1:8b-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 46.2, "accuracy": 60.0, "completeness": 76.9, "efficiency": 95.5, "wasted": 0.7, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 
0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 108.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 108.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 73.35, "tool_selection": 0.0, "basic_2step": 44.85, "sequential_3step": 68.11, "conditional_routing": 160.59, "sequential_reasoning": 80.53, "error_recovery": 0.0, "data_gap_recovery": 219.63, "data_gap_recovery_extended": 206.7, "argument_transformation": 181.55, "grounded_synthesis": 312.01, "inconsistent_api_recovery": 213.27, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 73.33, "tool_selection_stateful": 0.0, "basic_2step_stateful": 43.04, "sequential_3step_stateful": 68.31, "conditional_routing_stateful": 168.47, "sequential_reasoning_stateful": 80.55, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 219.63, "data_gap_recovery_extended_stateful": 206.71, "argument_transformation_stateful": 181.6, "grounded_synthesis_stateful": 312.06, "inconsistent_api_recovery_stateful": 213.25}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q8_0 LS/N [bare]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 46.2, "accuracy": 60.0, "completeness": 77.0, "efficiency": 95.5, "wasted": 1.1, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 2, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": 
{"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 397.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 250.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 63.25, "tool_selection": 1.19, "basic_2step": 34.66, "sequential_3step": 52.62, "conditional_routing": 184.62, "sequential_reasoning": 80.64, "error_recovery": 0.0, "data_gap_recovery": 199.73, "data_gap_recovery_extended": 249.63, "argument_transformation": 250.23, "grounded_synthesis": 294.17, "inconsistent_api_recovery": 196.5, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 64.25, "tool_selection_stateful": 0.0, "basic_2step_stateful": 38.06, "sequential_3step_stateful": 52.54, "conditional_routing_stateful": 191.22, "sequential_reasoning_stateful": 80.32, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 204.5, "data_gap_recovery_extended_stateful": 244.85, "argument_transformation_stateful": 218.46, "grounded_synthesis_stateful": 307.53, "inconsistent_api_recovery_stateful": 200.78}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:14b-q4_K_M OL/N [bare]", "model": "qwen3:14b-q4_K_M", "backend": "ollama", "mode": "native", 
"ablation": "bare", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 46.5, "accuracy": 61.3, "completeness": 75.8, "efficiency": 87.4, "wasted": 0.7, "speed": 34.7, "n": 50, "scenarios": {"relevance_detection": 90, "argument_fidelity": 92, "tool_selection": 4, "basic_2step": 4, "sequential_3step": 86, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 78, "data_gap_recovery_extended": 6, "argument_transformation": 4, "grounded_synthesis": 48, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 86, "argument_fidelity_stateful": 92, "tool_selection_stateful": 2, "basic_2step_stateful": 26, "sequential_3step_stateful": 84, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 86, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 43, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 3, "argument_transformation": 2, "grounded_synthesis": 24, "inconsistent_api_recovery": 3, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 42, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 45, "argument_fidelity": 138, "tool_selection": 6, "basic_2step": 4, "sequential_3step": 129, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 195, "data_gap_recovery_extended": 24, "argument_transformation": 10, "grounded_synthesis": 240, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 138, "tool_selection_stateful": 3, "basic_2step_stateful": 26, "sequential_3step_stateful": 126, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 45, "argument_fidelity": 138, "tool_selection": 6, "basic_2step": 4, "sequential_3step": 129, "conditional_routing": 227, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 189, "data_gap_recovery_extended": 22, "argument_transformation": 8, "grounded_synthesis": 392, "inconsistent_api_recovery": 26, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 138, "tool_selection_stateful": 3, "basic_2step_stateful": 26, "sequential_3step_stateful": 126, "conditional_routing_stateful": 169, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 213, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 342, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": 
{"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 29.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 291.0, "inconsistent_api_recovery": 3.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 33.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 280.0, "inconsistent_api_recovery_stateful": 4.0}, "scenarioWastedN": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 92.79, "argument_fidelity": 332.37, "tool_selection": 15.04, "basic_2step": 14.6, "sequential_3step": 654.97, "conditional_routing": 1250.52, "sequential_reasoning": 815.49, "error_recovery": 26.64, "data_gap_recovery": 1122.07, "data_gap_recovery_extended": 1668.12, "argument_transformation": 
2277.67, "grounded_synthesis": 7149.99, "inconsistent_api_recovery": 1188.58, "relevance_detection_stateful": 88.72, "argument_fidelity_stateful": 356.57, "tool_selection_stateful": 7.94, "basic_2step_stateful": 60.78, "sequential_3step_stateful": 733.63, "conditional_routing_stateful": 1427.97, "sequential_reasoning_stateful": 872.97, "error_recovery_stateful": 28.54, "data_gap_recovery_stateful": 1170.31, "data_gap_recovery_extended_stateful": 1790.73, "argument_transformation_stateful": 2999.66, "grounded_synthesis_stateful": 6843.9, "inconsistent_api_recovery_stateful": 1149.66}, "scenarioSpeedN": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 45.8, "accuracy": 81.4, "completeness": 56.3, "efficiency": 93.8, "wasted": 0.8, "speed": 2.6, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 
98, "data_gap_recovery_extended": 0, "argument_transformation": 16, "grounded_synthesis": 26, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 52, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 13, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 46, "grounded_synthesis": 14, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 46, "grounded_synthesis": 14, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, 
"conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 0, "argument_transformation": 40, "grounded_synthesis": 130, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 260, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 147, "data_gap_recovery_extended": 0, "argument_transformation": 38, "grounded_synthesis": 169, "inconsistent_api_recovery": 550, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 338, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 28.0, "grounded_synthesis": 39.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 
0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 16.0, "grounded_synthesis_stateful": 78.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 46, "grounded_synthesis": 14, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 37.33, "tool_selection": 0.0, "basic_2step": 19.45, "sequential_3step": 90.69, "conditional_routing": 113.48, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 119.72, "data_gap_recovery_extended": 0.0, "argument_transformation": 277.77, "grounded_synthesis": 71.55, "inconsistent_api_recovery": 188.12, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 37.73, "tool_selection_stateful": 0.0, "basic_2step_stateful": 21.4, "sequential_3step_stateful": 91.85, "conditional_routing_stateful": 111.31, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 121.93, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 300.82, 
"grounded_synthesis_stateful": 123.06, "inconsistent_api_recovery_stateful": 187.2}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 46, "grounded_synthesis": 14, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/N [bare]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 44.6, "accuracy": 63.0, "completeness": 70.8, "efficiency": 100.0, "wasted": 0.1, "speed": 13.8, "n": 50, "scenarios": {"relevance_detection": 90, "argument_fidelity": 74, "tool_selection": 2, "basic_2step": 88, "sequential_3step": 74, "conditional_routing": 90, "sequential_reasoning": 86, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 2, "argument_transformation": 16, "grounded_synthesis": 16, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 82, "tool_selection_stateful": 6, "basic_2step_stateful": 90, "sequential_3step_stateful": 72, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 70, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 4, 
"inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 45, "argument_fidelity": 37, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 45, "sequential_reasoning": 43, "error_recovery": 0, "data_gap_recovery": 30, "data_gap_recovery_extended": 1, "argument_transformation": 8, "grounded_synthesis": 8, "inconsistent_api_recovery": 3, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 41, "tool_selection_stateful": 3, "basic_2step_stateful": 45, "sequential_3step_stateful": 36, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 45, "argument_fidelity": 37, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 40, "conditional_routing": 48, "sequential_reasoning": 43, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 41, "argument_transformation": 29, 
"grounded_synthesis": 38, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 42, "tool_selection_stateful": 3, "basic_2step_stateful": 47, "sequential_3step_stateful": 37, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 45, "argument_fidelity": 37, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 40, "conditional_routing": 48, "sequential_reasoning": 43, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 41, "argument_transformation": 29, "grounded_synthesis": 38, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 42, "tool_selection_stateful": 3, "basic_2step_stateful": 47, "sequential_3step_stateful": 37, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 45, "argument_fidelity": 111, "tool_selection": 3, "basic_2step": 88, "sequential_3step": 111, "conditional_routing": 180, "sequential_reasoning": 172, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 8, "argument_transformation": 40, "grounded_synthesis": 80, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 123, "tool_selection_stateful": 9, "basic_2step_stateful": 90, "sequential_3step_stateful": 108, "conditional_routing_stateful": 152, "sequential_reasoning_stateful": 140, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 45, "argument_fidelity": 111, "tool_selection": 3, "basic_2step": 88, "sequential_3step": 111, "conditional_routing": 221, "sequential_reasoning": 172, "error_recovery": 0, "data_gap_recovery": 126, "data_gap_recovery_extended": 5, "argument_transformation": 33, "grounded_synthesis": 55, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 123, "tool_selection_stateful": 9, "basic_2step_stateful": 90, "sequential_3step_stateful": 108, "conditional_routing_stateful": 190, "sequential_reasoning_stateful": 140, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 129, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 23, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 41.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 8.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 6.0, "grounded_synthesis": 6.0, "inconsistent_api_recovery": 8.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 38.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 7.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 4.0}, "scenarioWastedN": {"relevance_detection": 45, "argument_fidelity": 37, "tool_selection": 1, "basic_2step": 44, 
"sequential_3step": 40, "conditional_routing": 48, "sequential_reasoning": 43, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 41, "argument_transformation": 29, "grounded_synthesis": 38, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 42, "tool_selection_stateful": 3, "basic_2step_stateful": 47, "sequential_3step_stateful": 37, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 46.16, "argument_fidelity": 229.61, "tool_selection": 4.67, "basic_2step": 134.95, "sequential_3step": 335.57, "conditional_routing": 903.96, "sequential_reasoning": 503.71, "error_recovery": 0.0, "data_gap_recovery": 552.36, "data_gap_recovery_extended": 785.85, "argument_transformation": 1036.15, "grounded_synthesis": 1098.94, "inconsistent_api_recovery": 673.96, "relevance_detection_stateful": 45.82, "argument_fidelity_stateful": 260.45, "tool_selection_stateful": 13.07, "basic_2step_stateful": 138.96, "sequential_3step_stateful": 334.11, "conditional_routing_stateful": 855.39, "sequential_reasoning_stateful": 434.5, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 558.88, "data_gap_recovery_extended_stateful": 834.25, "argument_transformation_stateful": 1314.87, "grounded_synthesis_stateful": 950.09, "inconsistent_api_recovery_stateful": 708.23}, "scenarioSpeedN": {"relevance_detection": 45, "argument_fidelity": 37, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 40, "conditional_routing": 48, "sequential_reasoning": 43, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 41, "argument_transformation": 29, "grounded_synthesis": 38, "inconsistent_api_recovery": 49, 
"relevance_detection_stateful": 44, "argument_fidelity_stateful": 42, "tool_selection_stateful": 3, "basic_2step_stateful": 47, "sequential_3step_stateful": 37, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}}, {"label": "Mistral-7B-Instruct-v0.3.Q4_K_M LF/P [reforged]", "model": "Mistral-7B-Instruct-v0.3.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 44.2, "accuracy": 45.6, "completeness": 96.9, "efficiency": 86.7, "wasted": 0.7, "speed": 5.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 12, "sequential_reasoning": 82, "error_recovery": 16, "data_gap_recovery": 22, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 90, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 18, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 6, "sequential_reasoning": 41, "error_recovery": 8, "data_gap_recovery": 11, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 5, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 45, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 24, "sequential_reasoning": 164, "error_recovery": 16, "data_gap_recovery": 55, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 135, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 27, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 199, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 212, "conditional_routing": 21, "sequential_reasoning": 142, "error_recovery": 32, 
"data_gap_recovery": 56, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 40, "inconsistent_api_recovery": 81, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 200, "tool_selection_stateful": 0, "basic_2step_stateful": 147, "sequential_3step_stateful": 186, "conditional_routing_stateful": 54, "sequential_reasoning_stateful": 53, "error_recovery_stateful": 36, "data_gap_recovery_stateful": 32, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 64, "inconsistent_api_recovery_stateful": 27}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 49.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 64.0, "conditional_routing": 35.0, "sequential_reasoning": 37.0, "error_recovery": 102.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 2.0, "grounded_synthesis": 36.0, "inconsistent_api_recovery": 53.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 47.0, "sequential_3step_stateful": 54.0, "conditional_routing_stateful": 36.0, "sequential_reasoning_stateful": 21.0, "error_recovery_stateful": 52.0, "data_gap_recovery_stateful": 16.0, "data_gap_recovery_extended_stateful": 9.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 28.0, "inconsistent_api_recovery_stateful": 61.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 
46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 30.01, "argument_fidelity": 132.95, "tool_selection": 57.16, "basic_2step": 73.43, "sequential_3step": 135.85, "conditional_routing": 257.72, "sequential_reasoning": 156.75, "error_recovery": 94.68, "data_gap_recovery": 251.89, "data_gap_recovery_extended": 375.06, "argument_transformation": 630.96, "grounded_synthesis": 831.62, "inconsistent_api_recovery": 594.59, "relevance_detection_stateful": 30.3, "argument_fidelity_stateful": 133.22, "tool_selection_stateful": 57.68, "basic_2step_stateful": 88.08, "sequential_3step_stateful": 125.86, "conditional_routing_stateful": 257.93, "sequential_reasoning_stateful": 123.91, "error_recovery_stateful": 89.08, "data_gap_recovery_stateful": 245.63, "data_gap_recovery_extended_stateful": 389.59, "argument_transformation_stateful": 717.4, "grounded_synthesis_stateful": 799.3, "inconsistent_api_recovery_stateful": 605.87}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 
49, "inconsistent_api_recovery_stateful": 49}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/P [reforged]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 43.9, "accuracy": 45.5, "completeness": 96.5, "efficiency": 84.6, "wasted": 0.7, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 26, "sequential_reasoning": 84, "error_recovery": 12, "data_gap_recovery": 30, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 66, "conditional_routing_stateful": 18, "sequential_reasoning_stateful": 22, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 8}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 
50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 13, "sequential_reasoning": 42, "error_recovery": 6, "data_gap_recovery": 15, "data_gap_recovery_extended": 1, "argument_transformation": 0, "grounded_synthesis": 5, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 33, "conditional_routing_stateful": 9, "sequential_reasoning_stateful": 11, "error_recovery_stateful": 3, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 7, "inconsistent_api_recovery_stateful": 4}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, "data_gap_recovery_extended": 47, "argument_transformation": 50, 
"grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 52, "sequential_reasoning": 168, "error_recovery": 12, "data_gap_recovery": 75, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 99, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 75, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 70, "inconsistent_api_recovery_stateful": 32}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 200, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 250, "conditional_routing": 69, "sequential_reasoning": 103, "error_recovery": 24, "data_gap_recovery": 84, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 33, "inconsistent_api_recovery": 65, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 200, "tool_selection_stateful": 0, "basic_2step_stateful": 150, "sequential_3step_stateful": 149, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 12, 
"data_gap_recovery_stateful": 83, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 65, "inconsistent_api_recovery_stateful": 43}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 50.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 100.0, "conditional_routing": 60.0, "sequential_reasoning": 14.0, "error_recovery": 100.0, "data_gap_recovery": 15.0, "data_gap_recovery_extended": 5.0, "argument_transformation": 3.0, "grounded_synthesis": 9.0, "inconsistent_api_recovery": 56.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 60.0, "conditional_routing_stateful": 56.0, "sequential_reasoning_stateful": 27.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 21.0, "data_gap_recovery_extended_stateful": 15.0, "argument_transformation_stateful": 7.0, "grounded_synthesis_stateful": 46.0, "inconsistent_api_recovery_stateful": 56.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 13.75, "argument_fidelity": 63.3, "tool_selection": 26.66, 
"basic_2step": 33.66, "sequential_3step": 72.21, "conditional_routing": 152.98, "sequential_reasoning": 68.95, "error_recovery": 37.12, "data_gap_recovery": 149.42, "data_gap_recovery_extended": 203.01, "argument_transformation": 344.15, "grounded_synthesis": 443.08, "inconsistent_api_recovery": 311.11, "relevance_detection_stateful": 13.84, "argument_fidelity_stateful": 62.59, "tool_selection_stateful": 27.2, "basic_2step_stateful": 40.3, "sequential_3step_stateful": 46.53, "conditional_routing_stateful": 147.85, "sequential_reasoning_stateful": 59.9, "error_recovery_stateful": 39.37, "data_gap_recovery_stateful": 154.54, "data_gap_recovery_extended_stateful": 210.4, "argument_transformation_stateful": 346.92, "grounded_synthesis_stateful": 518.39, "inconsistent_api_recovery_stateful": 314.31}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 43.6, "accuracy": 76.5, "completeness": 57.0, "efficiency": 100.0, "wasted": 0.2, "speed": 5.3, "n": 50, "scenarios": 
{"relevance_detection": 4, "argument_fidelity": 100, "tool_selection": 94, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 84, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 18, "argument_transformation": 12, "grounded_synthesis": 24, "inconsistent_api_recovery": 76, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 100, "tool_selection_stateful": 96, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 82, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 42, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 9, "argument_transformation": 6, "grounded_synthesis": 12, "inconsistent_api_recovery": 38, 
"relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 9, "argument_transformation": 46, "grounded_synthesis": 31, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 43}, "scenarioValidated": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 9, "argument_transformation": 46, "grounded_synthesis": 31, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 5, 
"argument_transformation_stateful": 44, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 43}, "scenarioIdealCalls": {"relevance_detection": 2, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 168, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 72, "argument_transformation": 30, "grounded_synthesis": 120, "inconsistent_api_recovery": 304, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 150, "tool_selection_stateful": 144, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 2, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 189, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 48, "argument_transformation": 30, "grounded_synthesis": 70, "inconsistent_api_recovery": 329, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 150, "tool_selection_stateful": 144, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 191, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 25, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 81, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 31.0, "sequential_reasoning": 0.0, "error_recovery": 
0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 18.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 41.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 33.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 7.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 37.0}, "scenarioWastedN": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 9, "argument_transformation": 46, "grounded_synthesis": 31, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 43}, "scenarioSpeedSum": {"relevance_detection": 0.73, "argument_fidelity": 51.65, "tool_selection": 37.43, "basic_2step": 22.54, "sequential_3step": 48.98, "conditional_routing": 253.31, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 9.1, "data_gap_recovery_extended": 57.17, "argument_transformation": 686.56, "grounded_synthesis": 254.03, "inconsistent_api_recovery": 409.48, "relevance_detection_stateful": 0.74, "argument_fidelity_stateful": 50.24, "tool_selection_stateful": 37.34, "basic_2step_stateful": 26.06, 
"sequential_3step_stateful": 48.74, "conditional_routing_stateful": 286.49, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 44.69, "argument_transformation_stateful": 679.59, "grounded_synthesis_stateful": 357.97, "inconsistent_api_recovery_stateful": 531.48}, "scenarioSpeedN": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 9, "argument_transformation": 46, "grounded_synthesis": 31, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 43}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [bare]", "model": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "mistral-small-3.2", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 42.5, "accuracy": 67.4, "completeness": 63.1, "efficiency": 100.0, "wasted": 0.0, "speed": 6.1, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 68, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 4, "argument_transformation": 12, "grounded_synthesis": 12, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, 
"tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 6, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 10, 
"inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 136, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 16, "argument_transformation": 30, 
"grounded_synthesis": 60, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 136, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 9, "argument_transformation": 23, "grounded_synthesis": 34, "inconsistent_api_recovery": 249, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 51, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 1.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 
0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 88.58, "tool_selection": 57.49, "basic_2step": 40.15, "sequential_3step": 86.77, "conditional_routing": 400.39, "sequential_reasoning": 23.68, "error_recovery": 0.0, "data_gap_recovery": 23.93, "data_gap_recovery_extended": 142.31, "argument_transformation": 712.42, "grounded_synthesis": 511.37, "inconsistent_api_recovery": 369.29, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 88.87, "tool_selection_stateful": 57.66, "basic_2step_stateful": 48.83, "sequential_3step_stateful": 84.93, "conditional_routing_stateful": 369.64, "sequential_reasoning_stateful": 50.28, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 10.55, "data_gap_recovery_extended_stateful": 140.11, "argument_transformation_stateful": 686.06, "grounded_synthesis_stateful": 632.49, "inconsistent_api_recovery_stateful": 341.32}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 43.3, "accuracy": 78.5, "completeness": 55.2, "efficiency": 100.0, "wasted": 0.3, "speed": 5.6, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 54, "basic_2step": 74, "sequential_3step": 94, "conditional_routing": 74, "sequential_reasoning": 8, "error_recovery": 0, "data_gap_recovery": 22, "data_gap_recovery_extended": 22, "argument_transformation": 28, "grounded_synthesis": 40, "inconsistent_api_recovery": 84, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 52, "basic_2step_stateful": 90, "sequential_3step_stateful": 94, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 24, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 14}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 27, "basic_2step": 37, "sequential_3step": 47, "conditional_routing": 37, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 11, "data_gap_recovery_extended": 11, "argument_transformation": 14, "grounded_synthesis": 20, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 45, "sequential_3step_stateful": 47, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 7}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 27, "basic_2step": 37, "sequential_3step": 47, "conditional_routing": 39, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 20, "argument_transformation": 45, "grounded_synthesis": 24, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 26, "basic_2step_stateful": 45, "sequential_3step_stateful": 47, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 21, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 46}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 27, "basic_2step": 37, "sequential_3step": 47, "conditional_routing": 39, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 20, "argument_transformation": 45, "grounded_synthesis": 24, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 45, "sequential_3step_stateful": 47, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 21, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 46}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 81, "basic_2step": 74, "sequential_3step": 141, "conditional_routing": 148, "sequential_reasoning": 16, "error_recovery": 0, "data_gap_recovery": 55, "data_gap_recovery_extended": 88, "argument_transformation": 70, "grounded_synthesis": 200, "inconsistent_api_recovery": 336, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 78, "basic_2step_stateful": 90, "sequential_3step_stateful": 141, "conditional_routing_stateful": 152, "sequential_reasoning_stateful": 12, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 96, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 130, 
"inconsistent_api_recovery_stateful": 56}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 81, "basic_2step": 74, "sequential_3step": 141, "conditional_routing": 142, "sequential_reasoning": 16, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 61, "argument_transformation": 65, "grounded_synthesis": 104, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 78, "basic_2step_stateful": 90, "sequential_3step_stateful": 141, "conditional_routing_stateful": 156, "sequential_reasoning_stateful": 12, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 55, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 51, "grounded_synthesis_stateful": 64, "inconsistent_api_recovery_stateful": 84}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 14.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 14.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 78.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 21.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 4.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 13.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 67.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 27, "basic_2step": 37, "sequential_3step": 47, "conditional_routing": 39, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 
20, "argument_transformation": 45, "grounded_synthesis": 24, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 45, "sequential_3step_stateful": 47, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 21, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 46}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 46.58, "tool_selection": 18.58, "basic_2step": 16.64, "sequential_3step": 52.93, "conditional_routing": 230.91, "sequential_reasoning": 10.96, "error_recovery": 0.0, "data_gap_recovery": 70.45, "data_gap_recovery_extended": 203.99, "argument_transformation": 606.31, "grounded_synthesis": 324.89, "inconsistent_api_recovery": 405.76, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 49.15, "tool_selection_stateful": 18.54, "basic_2step_stateful": 22.08, "sequential_3step_stateful": 60.33, "conditional_routing_stateful": 249.7, "sequential_reasoning_stateful": 7.27, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 97.27, "data_gap_recovery_extended_stateful": 167.52, "argument_transformation_stateful": 605.89, "grounded_synthesis_stateful": 423.38, "inconsistent_api_recovery_stateful": 355.19}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 27, "basic_2step": 37, "sequential_3step": 47, "conditional_routing": 39, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 20, "argument_transformation": 45, "grounded_synthesis": 24, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 45, "sequential_3step_stateful": 47, 
"conditional_routing_stateful": 41, "sequential_reasoning_stateful": 3, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 21, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 46}}, {"label": "granite-4.1-8b-Q8_0 LS/P [bare]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 42.3, "accuracy": 50.0, "completeness": 84.6, "efficiency": 86.0, "wasted": 0.4, "speed": 2.3, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 695, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 695, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 196.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 196.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 61.55, "tool_selection": 35.06, "basic_2step": 31.03, "sequential_3step": 61.04, "conditional_routing": 142.58, "sequential_reasoning": 78.07, "error_recovery": 0.0, "data_gap_recovery": 59.55, "data_gap_recovery_extended": 98.13, "argument_transformation": 310.74, "grounded_synthesis": 273.22, "inconsistent_api_recovery": 115.12, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 61.72, "tool_selection_stateful": 35.15, "basic_2step_stateful": 34.03, "sequential_3step_stateful": 64.66, "conditional_routing_stateful": 141.95, "sequential_reasoning_stateful": 75.38, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 60.51, "data_gap_recovery_extended_stateful": 98.14, "argument_transformation_stateful": 292.71, "grounded_synthesis_stateful": 273.0, "inconsistent_api_recovery_stateful": 118.04}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/N [bare]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "granite-4.0-h-micro", "quant": 
"q4_K_M", "gen": 1, "retired": true, "score": 42.2, "accuracy": 55.0, "completeness": 76.8, "efficiency": 100.0, "wasted": 0.2, "speed": 3.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 98, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 392, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 200, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 450, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 294, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 548, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, 
"tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 100.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 49.0, "inconsistent_api_recovery_stateful": 100.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 75.19, "tool_selection": 0.0, "basic_2step": 43.55, "sequential_3step": 68.11, "conditional_routing": 219.17, "sequential_reasoning": 90.8, "error_recovery": 0.0, "data_gap_recovery": 157.3, "data_gap_recovery_extended": 220.95, "argument_transformation": 134.97, "grounded_synthesis": 269.28, "inconsistent_api_recovery": 354.72, 
"relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 75.14, "tool_selection_stateful": 0.0, "basic_2step_stateful": 52.01, "sequential_3step_stateful": 68.92, "conditional_routing_stateful": 151.42, "sequential_reasoning_stateful": 88.71, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 157.11, "data_gap_recovery_extended_stateful": 325.68, "argument_transformation_stateful": 134.3, "grounded_synthesis_stateful": 333.48, "inconsistent_api_recovery_stateful": 359.22}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 42.5, "accuracy": 74.8, "completeness": 56.8, "efficiency": 100.0, "wasted": 0.3, "speed": 3.6, "n": 50, "scenarios": {"relevance_detection": 12, "argument_fidelity": 100, "tool_selection": 90, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 72, "sequential_reasoning": 2, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 12, "argument_transformation": 12, "grounded_synthesis": 
14, "inconsistent_api_recovery": 84, "relevance_detection_stateful": 20, "argument_fidelity_stateful": 100, "tool_selection_stateful": 78, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 6, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 6, "argument_transformation": 6, "grounded_synthesis": 7, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 6, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 9, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 6, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 9, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 6, "argument_fidelity": 150, "tool_selection": 135, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 144, "sequential_reasoning": 4, 
"error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 48, "argument_transformation": 30, "grounded_synthesis": 70, "inconsistent_api_recovery": 336, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 150, "tool_selection_stateful": 117, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 128, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 120, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 6, "argument_fidelity": 150, "tool_selection": 135, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 168, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 22, "argument_transformation": 25, "grounded_synthesis": 41, "inconsistent_api_recovery": 366, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 150, "tool_selection_stateful": 117, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 26, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 72, "inconsistent_api_recovery_stateful": 24}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 29.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 53.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, 
"sequential_3step_stateful": 0.0, "conditional_routing_stateful": 22.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 22.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 64.0}, "scenarioWastedN": {"relevance_detection": 6, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 9, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 3.02, "argument_fidelity": 33.95, "tool_selection": 22.83, "basic_2step": 14.96, "sequential_3step": 27.64, "conditional_routing": 191.85, "sequential_reasoning": 2.0, "error_recovery": 0.0, "data_gap_recovery": 2.76, "data_gap_recovery_extended": 38.49, "argument_transformation": 439.66, "grounded_synthesis": 206.3, "inconsistent_api_recovery": 340.19, "relevance_detection_stateful": 3.1, "argument_fidelity_stateful": 33.18, "tool_selection_stateful": 19.64, "basic_2step_stateful": 17.06, "sequential_3step_stateful": 27.2, "conditional_routing_stateful": 156.17, "sequential_reasoning_stateful": 5.19, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 44.66, "argument_transformation_stateful": 458.59, "grounded_synthesis_stateful": 
226.05, "inconsistent_api_recovery_stateful": 367.09}, "scenarioSpeedN": {"relevance_detection": 6, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 9, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 35, "inconsistent_api_recovery_stateful": 48}}, {"label": "qwen3:8b-q4_K_M OL/N [bare]", "model": "qwen3:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 40.7, "accuracy": 53.0, "completeness": 76.8, "efficiency": 96.2, "wasted": 0.1, "speed": 15.8, "n": 50, "scenarios": {"relevance_detection": 56, "argument_fidelity": 98, "tool_selection": 2, "basic_2step": 4, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 2, "data_gap_recovery": 38, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 12, "relevance_detection_stateful": 68, "argument_fidelity_stateful": 100, "tool_selection_stateful": 6, "basic_2step_stateful": 78, "sequential_3step_stateful": 98, "conditional_routing_stateful": 74, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, 
"scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 1, "data_gap_recovery": 19, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 49, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 14, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 28, "argument_fidelity": 147, "tool_selection": 3, "basic_2step": 4, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 2, "data_gap_recovery": 95, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, "basic_2step_stateful": 78, "sequential_3step_stateful": 147, "conditional_routing_stateful": 148, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 28, "argument_fidelity": 147, "tool_selection": 3, "basic_2step": 4, "sequential_3step": 150, "conditional_routing": 222, "sequential_reasoning": 200, "error_recovery": 1, "data_gap_recovery": 98, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, "basic_2step_stateful": 78, "sequential_3step_stateful": 147, "conditional_routing_stateful": 185, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 68, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 38.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 88.7, "argument_fidelity": 263.29, "tool_selection": 5.57, "basic_2step": 8.35, "sequential_3step": 409.89, "conditional_routing": 897.67, "sequential_reasoning": 350.9, "error_recovery": 138.66, "data_gap_recovery": 384.11, "data_gap_recovery_extended": 800.46, "argument_transformation": 1672.38, "grounded_synthesis": 1694.51, "inconsistent_api_recovery": 1023.5, "relevance_detection_stateful": 105.93, "argument_fidelity_stateful": 301.7, "tool_selection_stateful": 20.16, "basic_2step_stateful": 140.21, "sequential_3step_stateful": 438.09, "conditional_routing_stateful": 960.04, "sequential_reasoning_stateful": 414.75, "error_recovery_stateful": 145.6, "data_gap_recovery_stateful": 319.34, "data_gap_recovery_extended_stateful": 846.06, "argument_transformation_stateful": 1965.91, "grounded_synthesis_stateful": 1540.69, "inconsistent_api_recovery_stateful": 880.64}, "scenarioSpeedN": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}}, {"label": "granite4.1:8b-q4_K_M OL/N [bare]", "model": "granite4.1:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 38.6, "accuracy": 50.2, "completeness": 76.9, "efficiency": 94.1, "wasted": 1.0, "speed": 2.1, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 400.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 49.75, "tool_selection": 0.0, "basic_2step": 27.15, "sequential_3step": 41.51, "conditional_routing": 106.74, "sequential_reasoning": 54.63, "error_recovery": 0.0, "data_gap_recovery": 125.98, "data_gap_recovery_extended": 155.92, "argument_transformation": 188.18, "grounded_synthesis": 172.18, "inconsistent_api_recovery": 147.02, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 49.76, "tool_selection_stateful": 0.0, "basic_2step_stateful": 23.04, "sequential_3step_stateful": 41.54, "conditional_routing_stateful": 97.21, "sequential_reasoning_stateful": 54.64, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 126.88, "data_gap_recovery_extended_stateful": 155.97, "argument_transformation_stateful": 188.41, "grounded_synthesis_stateful": 172.28, "inconsistent_api_recovery_stateful": 147.11}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/N [bare]", "model": 
"Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 38.2, "accuracy": 45.3, "completeness": 84.5, "efficiency": 95.9, "wasted": 0.6, "speed": 1.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 2, "conditional_routing": 58, "sequential_reasoning": 56, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 4, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 54, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 32, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 1, "conditional_routing": 29, "sequential_reasoning": 28, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 27, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 16, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 3, "conditional_routing": 116, "sequential_reasoning": 112, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 20, "inconsistent_api_recovery": 112, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 104, "sequential_reasoning_stateful": 64, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 133, "basic_2step": 100, "sequential_3step": 3, "conditional_routing": 142, "sequential_reasoning": 83, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 24, "inconsistent_api_recovery": 151, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 130, "sequential_reasoning_stateful": 77, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 13, 
"inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 31.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 18.0, "grounded_synthesis": 160.0, "inconsistent_api_recovery": 77.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 18.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 32.0, "sequential_reasoning_stateful": 22.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 8.0, "grounded_synthesis_stateful": 220.0, "inconsistent_api_recovery_stateful": 78.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 10.35, "argument_fidelity": 40.98, "tool_selection": 34.09, "basic_2step": 18.11, "sequential_3step": 7.08, "conditional_routing": 76.01, "sequential_reasoning": 41.78, "error_recovery": 0.0, "data_gap_recovery": 19.12, 
"data_gap_recovery_extended": 19.66, "argument_transformation": 47.61, "grounded_synthesis": 159.89, "inconsistent_api_recovery": 161.92, "relevance_detection_stateful": 10.61, "argument_fidelity_stateful": 43.52, "tool_selection_stateful": 38.5, "basic_2step_stateful": 20.24, "sequential_3step_stateful": 6.34, "conditional_routing_stateful": 72.96, "sequential_reasoning_stateful": 70.2, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 26.38, "data_gap_recovery_extended_stateful": 17.01, "argument_transformation_stateful": 29.77, "grounded_synthesis_stateful": 184.4, "inconsistent_api_recovery_stateful": 169.59}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}}, {"label": "granite-4.0-h-tiny-Q4_K_M LS/N [reforged]", "model": "granite-4.0-h-tiny-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 38.5, "accuracy": 45.7, "completeness": 84.2, "efficiency": 75.0, "wasted": 2.6, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, 
"error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, 
"basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 201, "basic_2step": 200, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 300, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 200, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 300, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 51.0, "basic_2step": 100.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 100.0, "error_recovery": 0.0, "data_gap_recovery": 250.0, "data_gap_recovery_extended": 96.0, "argument_transformation": 452.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 250.0, "relevance_detection_stateful": 0.0, 
"argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 100.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 100.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 250.0, "data_gap_recovery_extended_stateful": 92.0, "argument_transformation_stateful": 452.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 250.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 55.6, "tool_selection": 78.52, "basic_2step": 50.32, "sequential_3step": 46.96, "conditional_routing": 138.26, "sequential_reasoning": 156.9, "error_recovery": 0.0, "data_gap_recovery": 352.05, "data_gap_recovery_extended": 328.53, "argument_transformation": 260.35, "grounded_synthesis": 300.07, "inconsistent_api_recovery": 323.81, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 55.87, "tool_selection_stateful": 78.23, "basic_2step_stateful": 54.1, "sequential_3step_stateful": 47.07, "conditional_routing_stateful": 140.14, "sequential_reasoning_stateful": 154.63, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 360.19, 
"data_gap_recovery_extended_stateful": 317.45, "argument_transformation_stateful": 263.49, "grounded_synthesis_stateful": 298.47, "inconsistent_api_recovery_stateful": 322.8}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-micro-q4_K_M OL/N [reforged]", "model": "granite-4.0:h-micro-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 38.4, "accuracy": 55.6, "completeness": 69.1, "efficiency": 81.7, "wasted": 3.2, "speed": 7.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 98, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, 
"sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 392, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 300, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 196.0, "data_gap_recovery": 148.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 343.0, "grounded_synthesis": 400.0, "inconsistent_api_recovery": 348.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 100.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 147.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 4.0, "argument_transformation_stateful": 343.0, "grounded_synthesis_stateful": 400.0, "inconsistent_api_recovery_stateful": 348.0}, "scenarioWastedN": {"relevance_detection": 
50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 66.86, "argument_fidelity": 98.78, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 79.72, "conditional_routing": 0.0, "sequential_reasoning": 122.43, "error_recovery": 342.05, "data_gap_recovery": 825.62, "data_gap_recovery_extended": 0.0, "argument_transformation": 401.08, "grounded_synthesis": 675.93, "inconsistent_api_recovery": 538.42, "relevance_detection_stateful": 66.75, "argument_fidelity_stateful": 97.65, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 79.72, "conditional_routing_stateful": 949.28, "sequential_reasoning_stateful": 120.57, "error_recovery_stateful": 337.54, "data_gap_recovery_stateful": 7.27, "data_gap_recovery_extended_stateful": 26.3, "argument_transformation_stateful": 403.09, "grounded_synthesis_stateful": 675.91, "inconsistent_api_recovery_stateful": 539.73}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/N [bare]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 36.5, "accuracy": 46.2, "completeness": 79.1, "efficiency": 96.2, "wasted": 0.8, "speed": 1.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 98, "tool_selection": 100, "basic_2step": 78, "sequential_3step": 4, "conditional_routing": 28, "sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 66, "basic_2step_stateful": 88, "sequential_3step_stateful": 0, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 
50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 39, "sequential_3step": 2, "conditional_routing": 14, "sequential_reasoning": 31, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 20, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 44, "sequential_3step_stateful": 0, "conditional_routing_stateful": 14, "sequential_reasoning_stateful": 22, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, 
"argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 45}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 45}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 150, "basic_2step": 78, "sequential_3step": 6, "conditional_routing": 56, "sequential_reasoning": 124, "error_recovery": 0, "data_gap_recovery": 15, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 160, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 99, "basic_2step_stateful": 88, "sequential_3step_stateful": 0, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 88, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 140, "basic_2step": 77, "sequential_3step": 6, "conditional_routing": 70, "sequential_reasoning": 94, "error_recovery": 0, 
"data_gap_recovery": 11, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 219, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 102, "basic_2step_stateful": 88, "sequential_3step_stateful": 0, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 99, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 16.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 133.0, "grounded_synthesis": 122.0, "inconsistent_api_recovery": 102.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 21.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 16.0, "sequential_reasoning_stateful": 18.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 66.0, "grounded_synthesis_stateful": 197.0, "inconsistent_api_recovery_stateful": 108.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, 
"conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 45}, "scenarioSpeedSum": {"relevance_detection": 15.85, "argument_fidelity": 63.02, "tool_selection": 56.58, "basic_2step": 24.36, "sequential_3step": 5.54, "conditional_routing": 43.82, "sequential_reasoning": 60.86, "error_recovery": 3.31, "data_gap_recovery": 34.35, "data_gap_recovery_extended": 32.87, "argument_transformation": 103.09, "grounded_synthesis": 233.37, "inconsistent_api_recovery": 251.73, "relevance_detection_stateful": 16.15, "argument_fidelity_stateful": 64.36, "tool_selection_stateful": 63.02, "basic_2step_stateful": 28.95, "sequential_3step_stateful": 5.07, "conditional_routing_stateful": 42.62, "sequential_reasoning_stateful": 87.37, "error_recovery_stateful": 1.39, "data_gap_recovery_stateful": 36.72, "data_gap_recovery_extended_stateful": 33.12, "argument_transformation_stateful": 57.68, "grounded_synthesis_stateful": 263.53, "inconsistent_api_recovery_stateful": 254.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, 
"inconsistent_api_recovery_stateful": 45}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [bare]", "model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 37.5, "accuracy": 85.7, "completeness": 43.7, "efficiency": 94.5, "wasted": 0.2, "speed": 6.6, "n": 50, "scenarios": {"relevance_detection": 32, "argument_fidelity": 100, "tool_selection": 88, "basic_2step": 74, "sequential_3step": 86, "conditional_routing": 70, "sequential_reasoning": 6, "error_recovery": 0, "data_gap_recovery": 20, "data_gap_recovery_extended": 10, "argument_transformation": 0, "grounded_synthesis": 14, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 14, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 72, "sequential_3step_stateful": 92, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 18, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 35, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 5, "argument_transformation": 0, "grounded_synthesis": 7, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 36, "sequential_3step_stateful": 46, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}, "scenarioValidated": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 22, 
"inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}, "scenarioIdealCalls": {"relevance_detection": 16, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 74, "sequential_3step": 129, "conditional_routing": 140, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 40, "argument_transformation": 0, "grounded_synthesis": 70, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 72, "sequential_3step_stateful": 138, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 16, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 74, "sequential_3step": 129, "conditional_routing": 154, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 52, "data_gap_recovery_extended": 37, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 72, "sequential_3step_stateful": 138, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, 
"data_gap_recovery_extended_stateful": 51, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 74, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 19.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 44.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 44.0, "inconsistent_api_recovery_stateful": 1.0}, "scenarioWastedN": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}, "scenarioSpeedSum": {"relevance_detection": 24.23, "argument_fidelity": 189.73, "tool_selection": 145.03, "basic_2step": 73.72, "sequential_3step": 174.26, 
"conditional_routing": 429.97, "sequential_reasoning": 39.61, "error_recovery": 0.0, "data_gap_recovery": 139.28, "data_gap_recovery_extended": 180.47, "argument_transformation": 0.0, "grounded_synthesis": 424.52, "inconsistent_api_recovery": 49.47, "relevance_detection_stateful": 9.56, "argument_fidelity_stateful": 177.04, "tool_selection_stateful": 144.92, "basic_2step_stateful": 85.47, "sequential_3step_stateful": 175.16, "conditional_routing_stateful": 419.76, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 107.89, "data_gap_recovery_extended_stateful": 186.55, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 563.2, "inconsistent_api_recovery_stateful": 29.92}, "scenarioSpeedN": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q4_K_M LF/P [bare]", "model": "Meta-Llama-3.1-8B-Instruct.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 36.1, "accuracy": 42.6, "completeness": 84.6, "efficiency": 98.6, "wasted": 0.4, "speed": 2.2, "n": 50, "scenarios": {"relevance_detection": 92, "argument_fidelity": 98, "tool_selection": 60, 
"basic_2step": 100, "sequential_3step": 42, "conditional_routing": 62, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 90, "argument_fidelity_stateful": 96, "tool_selection_stateful": 60, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 62, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 46, "argument_fidelity": 49, "tool_selection": 30, "basic_2step": 50, "sequential_3step": 21, "conditional_routing": 31, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 9, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 48, "tool_selection_stateful": 30, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 31, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioIdealCalls": {"relevance_detection": 46, "argument_fidelity": 147, "tool_selection": 90, "basic_2step": 100, "sequential_3step": 63, "conditional_routing": 124, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 144, "tool_selection_stateful": 90, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 124, "sequential_reasoning_stateful": 16, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 46, "argument_fidelity": 148, "tool_selection": 79, "basic_2step": 100, "sequential_3step": 63, "conditional_routing": 131, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 22, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 146, "tool_selection_stateful": 90, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 137, "sequential_reasoning_stateful": 19, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 33, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 26}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 22.0, "sequential_reasoning": 16.0, "error_recovery": 0.0, "data_gap_recovery": 23.0, "data_gap_recovery_extended": 19.0, "argument_transformation": 31.0, "grounded_synthesis": 
0.0, "inconsistent_api_recovery": 18.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 85.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 72.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 29.0, "data_gap_recovery_extended_stateful": 27.0, "argument_transformation_stateful": 69.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 23.0}, "scenarioWastedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 19.36, "argument_fidelity": 63.56, "tool_selection": 53.96, "basic_2step": 38.56, "sequential_3step": 50.24, "conditional_routing": 109.39, "sequential_reasoning": 84.99, "error_recovery": 0.0, "data_gap_recovery": 109.28, "data_gap_recovery_extended": 146.13, "argument_transformation": 71.24, "grounded_synthesis": 243.69, "inconsistent_api_recovery": 166.1, "relevance_detection_stateful": 18.7, "argument_fidelity_stateful": 62.5, "tool_selection_stateful": 58.52, "basic_2step_stateful": 44.97, "sequential_3step_stateful": 73.02, "conditional_routing_stateful": 106.4, "sequential_reasoning_stateful": 
107.61, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 102.97, "data_gap_recovery_extended_stateful": 154.78, "argument_transformation_stateful": 130.27, "grounded_synthesis_stateful": 243.07, "inconsistent_api_recovery_stateful": 166.6}, "scenarioSpeedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-tiny-q4_K_M OL/N [reforged]", "model": "granite-4.0:h-tiny-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 35.5, "accuracy": 50.9, "completeness": 69.8, "efficiency": 78.9, "wasted": 1.4, "speed": 4.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 4, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 6, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 6, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 8, 
"sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 3, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 
50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 6, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 12, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, 
"basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 9, "basic_2step": 200, "sequential_3step": 150, "conditional_routing": 18, "sequential_reasoning": 250, "error_recovery": 200, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 13, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 200, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 3.0, "basic_2step": 100.0, "sequential_3step": 0.0, "conditional_routing": 6.0, "sequential_reasoning": 50.0, "error_recovery": 100.0, "data_gap_recovery": 47.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 0.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 217.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 4.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 9.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 51.0, "data_gap_recovery_extended_stateful": 10.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 
150.0, "inconsistent_api_recovery_stateful": 295.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 86.85, "tool_selection": 6.59, "basic_2step": 108.12, "sequential_3step": 80.18, "conditional_routing": 165.18, "sequential_reasoning": 156.15, "error_recovery": 115.15, "data_gap_recovery": 144.2, "data_gap_recovery_extended": 68.78, "argument_transformation": 133.34, "grounded_synthesis": 345.22, "inconsistent_api_recovery": 389.7, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 86.69, "tool_selection_stateful": 9.64, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 80.8, "conditional_routing_stateful": 177.47, "sequential_reasoning_stateful": 155.68, "error_recovery_stateful": 116.07, "data_gap_recovery_stateful": 150.95, "data_gap_recovery_extended_stateful": 88.53, "argument_transformation_stateful": 129.85, "grounded_synthesis_stateful": 345.05, "inconsistent_api_recovery_stateful": 491.27}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "mistral-nemo:12b-instruct-2407-q4_K_M OL/N [reforged]", "model": "mistral-nemo:12b-instruct-2407-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 34.2, "accuracy": 44.9, "completeness": 76.0, "efficiency": 48.7, "wasted": 3.4, "speed": 7.9, "n": 50, "scenarios": {"relevance_detection": 46, "argument_fidelity": 14, "tool_selection": 48, "basic_2step": 98, "sequential_3step": 28, "conditional_routing": 34, "sequential_reasoning": 50, "error_recovery": 44, "data_gap_recovery": 82, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 8, "tool_selection_stateful": 56, "basic_2step_stateful": 100, "sequential_3step_stateful": 28, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 64, "error_recovery_stateful": 34, "data_gap_recovery_stateful": 78, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 23, "argument_fidelity": 7, "tool_selection": 24, "basic_2step": 49, "sequential_3step": 14, "conditional_routing": 17, "sequential_reasoning": 25, "error_recovery": 22, "data_gap_recovery": 41, "data_gap_recovery_extended": 6, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 4, "tool_selection_stateful": 28, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 32, "error_recovery_stateful": 17, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, 
"sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}, "scenarioValidated": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}, "scenarioIdealCalls": {"relevance_detection": 23, "argument_fidelity": 21, "tool_selection": 72, "basic_2step": 98, "sequential_3step": 42, "conditional_routing": 68, "sequential_reasoning": 100, "error_recovery": 44, "data_gap_recovery": 205, "data_gap_recovery_extended": 48, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 12, "tool_selection_stateful": 84, "basic_2step_stateful": 100, "sequential_3step_stateful": 42, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 128, "error_recovery_stateful": 51, "data_gap_recovery_stateful": 195, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 47, 
"tool_selection": 162, "basic_2step": 221, "sequential_3step": 103, "conditional_routing": 184, "sequential_reasoning": 194, "error_recovery": 113, "data_gap_recovery": 349, "data_gap_recovery_extended": 89, "argument_transformation": 0, "grounded_synthesis": 19, "inconsistent_api_recovery": 167, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 26, "tool_selection_stateful": 198, "basic_2step_stateful": 172, "sequential_3step_stateful": 107, "conditional_routing_stateful": 189, "sequential_reasoning_stateful": 233, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 321, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 77.0, "argument_fidelity": 151.0, "tool_selection": 109.0, "basic_2step": 123.0, "sequential_3step": 154.0, "conditional_routing": 140.0, "sequential_reasoning": 140.0, "error_recovery": 160.0, "data_gap_recovery": 166.0, "data_gap_recovery_extended": 192.0, "argument_transformation": 28.0, "grounded_synthesis": 57.0, "inconsistent_api_recovery": 282.0, "relevance_detection_stateful": 87.0, "argument_fidelity_stateful": 151.0, "tool_selection_stateful": 142.0, "basic_2step_stateful": 72.0, "sequential_3step_stateful": 151.0, "conditional_routing_stateful": 164.0, "sequential_reasoning_stateful": 132.0, "error_recovery_stateful": 119.0, "data_gap_recovery_stateful": 141.0, "data_gap_recovery_extended_stateful": 160.0, "argument_transformation_stateful": 50.0, "grounded_synthesis_stateful": 41.0, "inconsistent_api_recovery_stateful": 217.0}, "scenarioWastedN": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, 
"inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}, "scenarioSpeedSum": {"relevance_detection": 56.93, "argument_fidelity": 117.8, "tool_selection": 182.16, "basic_2step": 93.28, "sequential_3step": 320.16, "conditional_routing": 314.28, "sequential_reasoning": 242.33, "error_recovery": 239.99, "data_gap_recovery": 433.57, "data_gap_recovery_extended": 582.79, "argument_transformation": 218.44, "grounded_synthesis": 668.38, "inconsistent_api_recovery": 481.14, "relevance_detection_stateful": 70.58, "argument_fidelity_stateful": 126.07, "tool_selection_stateful": 216.67, "basic_2step_stateful": 81.48, "sequential_3step_stateful": 276.19, "conditional_routing_stateful": 317.28, "sequential_reasoning_stateful": 213.79, "error_recovery_stateful": 287.49, "data_gap_recovery_stateful": 402.25, "data_gap_recovery_extended_stateful": 483.4, "argument_transformation_stateful": 252.0, "grounded_synthesis_stateful": 679.32, "inconsistent_api_recovery_stateful": 424.44}, "scenarioSpeedN": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, 
"sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/P [bare]", "model": "Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 33.0, "accuracy": 38.5, "completeness": 85.7, "efficiency": 100.0, "wasted": 0.2, "speed": 1.3, "n": 50, "scenarios": {"relevance_detection": 68, "argument_fidelity": 80, "tool_selection": 82, "basic_2step": 100, "sequential_3step": 22, "conditional_routing": 50, "sequential_reasoning": 8, "error_recovery": 0, "data_gap_recovery": 24, "data_gap_recovery_extended": 20, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 64, "argument_fidelity_stateful": 84, "tool_selection_stateful": 80, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 34, "argument_fidelity": 40, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 11, "conditional_routing": 25, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 10, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 42, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 34, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 34, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 34, "argument_fidelity": 120, "tool_selection": 123, "basic_2step": 100, "sequential_3step": 33, "conditional_routing": 100, "sequential_reasoning": 16, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 80, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 126, "tool_selection_stateful": 120, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 34, "argument_fidelity": 120, "tool_selection": 96, "basic_2step": 100, "sequential_3step": 33, "conditional_routing": 119, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 39, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 3, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 126, 
"tool_selection_stateful": 125, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 120, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 24.0, "sequential_reasoning": 20.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 5.0, "argument_transformation": 12.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 5.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 3.0, "conditional_routing_stateful": 25.0, "sequential_reasoning_stateful": 55.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 20.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 20.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 34, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 6.38, "argument_fidelity": 32.01, "tool_selection": 27.74, "basic_2step": 18.04, "sequential_3step": 20.88, "conditional_routing": 51.23, "sequential_reasoning": 42.64, "error_recovery": 0.0, "data_gap_recovery": 49.28, "data_gap_recovery_extended": 84.08, "argument_transformation": 167.19, "grounded_synthesis": 145.91, "inconsistent_api_recovery": 78.02, "relevance_detection_stateful": 5.82, "argument_fidelity_stateful": 29.06, "tool_selection_stateful": 33.82, "basic_2step_stateful": 20.15, "sequential_3step_stateful": 18.59, "conditional_routing_stateful": 46.63, "sequential_reasoning_stateful": 53.54, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 69.39, "data_gap_recovery_extended_stateful": 82.25, "argument_transformation_stateful": 115.27, "grounded_synthesis_stateful": 143.85, "inconsistent_api_recovery_stateful": 80.45}, "scenarioSpeedN": {"relevance_detection": 34, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/N [bare]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "ministral-8b", 
"quant": "q8_0", "gen": 2, "retired": false, "score": 32.5, "accuracy": 62.0, "completeness": 52.5, "efficiency": 100.0, "wasted": 0.0, "speed": 4.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 66, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 74, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, 
"data_gap_recovery": 33, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 33, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 33, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 165, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 132, "data_gap_recovery_extended": 0, "argument_transformation": 6, "grounded_synthesis": 0, "inconsistent_api_recovery": 250, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 148, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, 
"sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 33, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 58.89, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 37.55, "conditional_routing": 144.4, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 178.1, "data_gap_recovery_extended": 37.43, "argument_transformation": 545.75, "grounded_synthesis": 293.46, "inconsistent_api_recovery": 188.86, "relevance_detection_stateful": 0.0, 
"argument_fidelity_stateful": 58.77, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 37.54, "conditional_routing_stateful": 168.45, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 199.72, "data_gap_recovery_extended_stateful": 55.84, "argument_transformation_stateful": 529.23, "grounded_synthesis_stateful": 305.33, "inconsistent_api_recovery_stateful": 188.75}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 33, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/P [bare]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 31.5, "accuracy": 36.3, "completeness": 86.8, "efficiency": 100.0, "wasted": 0.2, "speed": 2.1, "n": 50, "scenarios": {"relevance_detection": 78, "argument_fidelity": 94, "tool_selection": 72, "basic_2step": 100, "sequential_3step": 6, "conditional_routing": 22, "sequential_reasoning": 2, "error_recovery": 0, "data_gap_recovery": 24, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 
92, "argument_fidelity_stateful": 86, "tool_selection_stateful": 66, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 39, "argument_fidelity": 47, "tool_selection": 36, "basic_2step": 50, "sequential_3step": 3, "conditional_routing": 11, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 6, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 43, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 39, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 39, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 39, "argument_fidelity": 141, "tool_selection": 108, "basic_2step": 100, "sequential_3step": 9, "conditional_routing": 44, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 48, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 129, "tool_selection_stateful": 99, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 55, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 39, "argument_fidelity": 141, "tool_selection": 75, "basic_2step": 93, "sequential_3step": 8, "conditional_routing": 56, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 30, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 130, "tool_selection_stateful": 103, "basic_2step_stateful": 92, "sequential_3step_stateful": 0, "conditional_routing_stateful": 65, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 12.0, "sequential_reasoning": 17.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 8.0, "argument_transformation": 27.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.0, "tool_selection_stateful": 5.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 10.0, "conditional_routing_stateful": 13.0, "sequential_reasoning_stateful": 39.0, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 20.0, "data_gap_recovery_extended_stateful": 9.0, "argument_transformation_stateful": 37.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 39, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 10.81, "argument_fidelity": 68.01, "tool_selection": 48.61, "basic_2step": 27.89, "sequential_3step": 30.97, "conditional_routing": 90.13, "sequential_reasoning": 93.01, "error_recovery": 0.0, "data_gap_recovery": 84.15, "data_gap_recovery_extended": 118.23, "argument_transformation": 239.6, "grounded_synthesis": 208.57, "inconsistent_api_recovery": 123.12, "relevance_detection_stateful": 12.87, "argument_fidelity_stateful": 51.28, "tool_selection_stateful": 58.2, "basic_2step_stateful": 29.58, "sequential_3step_stateful": 33.61, "conditional_routing_stateful": 66.16, "sequential_reasoning_stateful": 81.45, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 104.99, "data_gap_recovery_extended_stateful": 170.91, "argument_transformation_stateful": 295.76, "grounded_synthesis_stateful": 209.04, "inconsistent_api_recovery_stateful": 129.5}, "scenarioSpeedN": 
{"relevance_detection": 39, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:14b-instruct-2512-q4_K_M OL/N [bare]", "model": "ministral-3:14b-instruct-2512-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 32.0, "accuracy": 84.4, "completeness": 37.9, "efficiency": 100.0, "wasted": 0.1, "speed": 3.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 15, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 120, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 28.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 40.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, 
"data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}, "scenarioSpeedSum": {"relevance_detection": 22.68, "argument_fidelity": 72.87, "tool_selection": 45.81, "basic_2step": 0.0, "sequential_3step": 92.07, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 219.68, "grounded_synthesis": 90.02, "inconsistent_api_recovery": 137.73, "relevance_detection_stateful": 22.95, "argument_fidelity_stateful": 72.74, "tool_selection_stateful": 45.7, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 117.96, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 4.92, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 306.66, "grounded_synthesis_stateful": 272.89, "inconsistent_api_recovery_stateful": 119.49}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 
50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q8_0 LF/P [bare]", "model": "Meta-Llama-3.1-8B-Instruct.Q8_0", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 29.8, "accuracy": 34.6, "completeness": 86.3, "efficiency": 100.0, "wasted": 0.2, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 82, "tool_selection": 22, "basic_2step": 100, "sequential_3step": 12, "conditional_routing": 48, "sequential_reasoning": 6, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 72, "tool_selection_stateful": 14, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 41, "tool_selection": 11, "basic_2step": 50, "sequential_3step": 6, "conditional_routing": 24, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 36, "tool_selection_stateful": 7, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 23, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, "argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, "argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 123, "tool_selection": 33, "basic_2step": 100, "sequential_3step": 18, "conditional_routing": 96, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 70, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 108, "tool_selection_stateful": 21, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 75, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 49, "argument_fidelity": 123, "tool_selection": 30, "basic_2step": 100, "sequential_3step": 18, "conditional_routing": 114, "sequential_reasoning": 8, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 34, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 109, "tool_selection_stateful": 21, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 54, "data_gap_recovery_extended_stateful": 26, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 20.0, "sequential_reasoning": 4.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 9.0, "argument_transformation": 88.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 1.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 6.0, "conditional_routing_stateful": 22.0, "sequential_reasoning_stateful": 39.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 9.0, "argument_transformation_stateful": 60.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, "argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, 
"argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 25.18, "argument_fidelity": 82.94, "tool_selection": 59.55, "basic_2step": 50.54, "sequential_3step": 56.69, "conditional_routing": 122.34, "sequential_reasoning": 109.92, "error_recovery": 0.0, "data_gap_recovery": 140.71, "data_gap_recovery_extended": 206.29, "argument_transformation": 358.24, "grounded_synthesis": 349.71, "inconsistent_api_recovery": 199.46, "relevance_detection_stateful": 26.05, "argument_fidelity_stateful": 85.06, "tool_selection_stateful": 60.87, "basic_2step_stateful": 56.08, "sequential_3step_stateful": 56.3, "conditional_routing_stateful": 125.12, "sequential_reasoning_stateful": 140.72, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 138.89, "data_gap_recovery_extended_stateful": 204.91, "argument_transformation_stateful": 242.59, "grounded_synthesis_stateful": 337.19, "inconsistent_api_recovery_stateful": 199.35}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, "argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/N [bare]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": 
"qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 28.7, "accuracy": 50.1, "completeness": 57.2, "efficiency": 100.0, "wasted": 0.0, "speed": 17.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 4, "tool_selection": 0, "basic_2step": 24, "sequential_3step": 46, "conditional_routing": 62, "sequential_reasoning": 30, "error_recovery": 0, "data_gap_recovery": 20, "data_gap_recovery_extended": 6, "argument_transformation": 10, "grounded_synthesis": 44, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 12, "tool_selection_stateful": 8, "basic_2step_stateful": 54, "sequential_3step_stateful": 54, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 34, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 2, "tool_selection": 0, "basic_2step": 12, "sequential_3step": 23, "conditional_routing": 31, "sequential_reasoning": 15, 
"error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 3, "argument_transformation": 5, "grounded_synthesis": 22, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 4, "basic_2step_stateful": 27, "sequential_3step_stateful": 27, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 17, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 2, "tool_selection": 2, "basic_2step": 12, "sequential_3step": 43, "conditional_routing": 35, "sequential_reasoning": 15, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 42, "argument_transformation": 28, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 8, "basic_2step_stateful": 27, "sequential_3step_stateful": 45, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 2, "tool_selection": 2, "basic_2step": 12, "sequential_3step": 43, "conditional_routing": 35, "sequential_reasoning": 15, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 42, "argument_transformation": 28, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 8, "basic_2step_stateful": 27, "sequential_3step_stateful": 45, "conditional_routing_stateful": 40, 
"sequential_reasoning_stateful": 18, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 24, "sequential_3step": 69, "conditional_routing": 124, "sequential_reasoning": 60, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 24, "argument_transformation": 25, "grounded_synthesis": 220, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 18, "tool_selection_stateful": 12, "basic_2step_stateful": 54, "sequential_3step_stateful": 81, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 160, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 12, "sequential_3step": 69, "conditional_routing": 100, "sequential_reasoning": 60, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 11, "argument_transformation": 26, "grounded_synthesis": 81, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 18, "tool_selection_stateful": 12, "basic_2step_stateful": 49, "sequential_3step_stateful": 81, "conditional_routing_stateful": 118, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 25, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 73, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, 
"tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 6.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 1.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 16.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 2, "tool_selection": 2, "basic_2step": 12, "sequential_3step": 43, "conditional_routing": 35, "sequential_reasoning": 15, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 42, "argument_transformation": 28, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 8, "basic_2step_stateful": 27, "sequential_3step_stateful": 45, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 87.08, "argument_fidelity": 13.18, "tool_selection": 5.58, "basic_2step": 23.03, "sequential_3step": 316.4, "conditional_routing": 576.56, "sequential_reasoning": 248.41, "error_recovery": 0.0, "data_gap_recovery": 426.93, "data_gap_recovery_extended": 792.68, "argument_transformation": 1242.69, "grounded_synthesis": 1796.29, "inconsistent_api_recovery": 
688.51, "relevance_detection_stateful": 103.93, "argument_fidelity_stateful": 36.78, "tool_selection_stateful": 35.21, "basic_2step_stateful": 87.2, "sequential_3step_stateful": 318.61, "conditional_routing_stateful": 716.26, "sequential_reasoning_stateful": 310.02, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 362.74, "data_gap_recovery_extended_stateful": 851.34, "argument_transformation_stateful": 1329.24, "grounded_synthesis_stateful": 1716.34, "inconsistent_api_recovery_stateful": 738.6}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 2, "tool_selection": 2, "basic_2step": 12, "sequential_3step": 43, "conditional_routing": 35, "sequential_reasoning": 15, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 42, "argument_transformation": 28, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 8, "basic_2step_stateful": 27, "sequential_3step_stateful": 45, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [bare]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 27.5, "accuracy": 80.8, "completeness": 34.1, "efficiency": 100.0, "wasted": 0.0, "speed": 3.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, 
"grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 10, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 7, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 10, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 7, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, 
"data_gap_recovery": 25, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 20, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 250, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 147, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 
0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 10, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 7, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 51.86, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 37.12, "conditional_routing": 147.09, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 14.73, "data_gap_recovery_extended": 0.0, "argument_transformation": 82.9, "grounded_synthesis": 140.28, "inconsistent_api_recovery": 185.93, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 51.8, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 36.95, "conditional_routing_stateful": 168.82, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.68, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 58.77, "grounded_synthesis_stateful": 136.38, "inconsistent_api_recovery_stateful": 185.83}, "scenarioSpeedN": 
{"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 10, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 7, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/P [reforged]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 26.9, "accuracy": 50.0, "completeness": 53.8, "efficiency": 81.6, "wasted": 0.4, "speed": 2.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 250, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 200, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 250, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 203, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 100.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 50.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 100.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 53.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, 
"data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 25.7, "argument_fidelity": 151.34, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 176.38, "conditional_routing": 136.79, "sequential_reasoning": 97.11, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 126.94, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 313.8, "relevance_detection_stateful": 23.99, "argument_fidelity_stateful": 147.54, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 188.05, "conditional_routing_stateful": 154.98, "sequential_reasoning_stateful": 72.57, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 125.42, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 310.44}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [bare]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 27.3, "accuracy": 61.4, "completeness": 44.5, "efficiency": 100.0, "wasted": 0.0, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, 
"tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 20, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 16, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 250, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 
42, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 35.58, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 65.44, "conditional_routing": 100.85, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 16.5, "data_gap_recovery_extended": 0.0, "argument_transformation": 372.67, "grounded_synthesis": 398.69, "inconsistent_api_recovery": 125.34, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 35.37, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 61.14, "conditional_routing_stateful": 102.37, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 4.2, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 385.63, "grounded_synthesis_stateful": 379.81, "inconsistent_api_recovery_stateful": 125.16}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-micro-q4_K_M OL/N [bare]", "model": "granite-4.0:h-micro-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 23.1, "accuracy": 
50.0, "completeness": 46.2, "efficiency": 100.0, "wasted": 2.5, "speed": 5.3, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 
1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, 
"error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 350.0, "grounded_synthesis": 392.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 350.0, "grounded_synthesis_stateful": 392.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 97.1, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 80.12, "conditional_routing": 0.0, "sequential_reasoning": 120.21, "error_recovery": 0.0, "data_gap_recovery": 7.26, "data_gap_recovery_extended": 0.0, "argument_transformation": 419.95, "grounded_synthesis": 653.26, "inconsistent_api_recovery": 227.45, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 97.66, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, 
"sequential_3step_stateful": 79.56, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 120.66, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 7.27, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 418.69, "grounded_synthesis_stateful": 653.94, "inconsistent_api_recovery_stateful": 226.84}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3.Q4_K_M LF/P [bare]", "model": "Mistral-7B-Instruct-v0.3.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 20.2, "accuracy": 22.6, "completeness": 89.7, "efficiency": 100.0, "wasted": 0.0, "speed": 2.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 36, "sequential_3step": 10, "conditional_routing": 10, "sequential_reasoning": 74, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 46, 
"sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 24, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 18, "sequential_3step": 5, "conditional_routing": 5, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 23, "sequential_3step_stateful": 2, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 12, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 36, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 148, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 46, "sequential_3step_stateful": 6, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 21, "sequential_3step": 10, "conditional_routing": 10, "sequential_reasoning": 74, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 32, "sequential_3step_stateful": 4, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 26.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 
0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 30.79, "argument_fidelity": 55.41, "tool_selection": 57.1, "basic_2step": 40.02, "sequential_3step": 64.78, "conditional_routing": 122.38, "sequential_reasoning": 114.29, "error_recovery": 0.0, "data_gap_recovery": 133.9, "data_gap_recovery_extended": 190.95, "argument_transformation": 285.05, "grounded_synthesis": 317.07, "inconsistent_api_recovery": 228.95, "relevance_detection_stateful": 29.9, "argument_fidelity_stateful": 56.13, "tool_selection_stateful": 57.92, "basic_2step_stateful": 48.67, "sequential_3step_stateful": 58.34, "conditional_routing_stateful": 140.87, "sequential_reasoning_stateful": 121.56, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 123.49, "data_gap_recovery_extended_stateful": 189.64, "argument_transformation_stateful": 306.09, "grounded_synthesis_stateful": 324.61, "inconsistent_api_recovery_stateful": 225.5}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3.Q8_0 LF/P [bare]", "model": "Mistral-7B-Instruct-v0.3.Q8_0", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 17.8, "accuracy": 19.6, "completeness": 91.0, "efficiency": 100.0, "wasted": 0.0, "speed": 4.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 16, "sequential_3step": 2, "conditional_routing": 30, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 10, "sequential_3step_stateful": 0, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 8, "sequential_3step": 1, "conditional_routing": 15, "sequential_reasoning": 33, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 5, "sequential_3step_stateful": 0, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 16, "sequential_3step": 3, "conditional_routing": 60, "sequential_reasoning": 132, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 10, "sequential_3step_stateful": 0, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, 
"basic_2step": 8, "sequential_3step": 2, "conditional_routing": 30, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 5, "sequential_3step_stateful": 0, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 38.34, "argument_fidelity": 74.39, "tool_selection": 74.94, "basic_2step": 42.95, "sequential_3step": 72.07, "conditional_routing": 164.86, "sequential_reasoning": 151.28, "error_recovery": 0.0, "data_gap_recovery": 197.88, "data_gap_recovery_extended": 273.33, "argument_transformation": 645.87, "grounded_synthesis": 471.26, "inconsistent_api_recovery": 291.14, "relevance_detection_stateful": 39.81, "argument_fidelity_stateful": 74.38, "tool_selection_stateful": 76.57, "basic_2step_stateful": 41.32, "sequential_3step_stateful": 71.47, "conditional_routing_stateful": 169.31, "sequential_reasoning_stateful": 123.13, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 222.38, "data_gap_recovery_extended_stateful": 289.34, "argument_transformation_stateful": 593.36, "grounded_synthesis_stateful": 435.84, "inconsistent_api_recovery_stateful": 292.39}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, 
"data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/P [bare]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 18.1, "accuracy": 20.1, "completeness": 90.0, "efficiency": 100.0, "wasted": 0.0, "speed": 1.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 30, "sequential_3step": 0, "conditional_routing": 4, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 56, "sequential_3step_stateful": 0, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 15, "sequential_3step": 0, "conditional_routing": 2, "sequential_reasoning": 33, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 1, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 0, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 30, "sequential_3step": 0, "conditional_routing": 8, "sequential_reasoning": 132, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 56, "sequential_3step_stateful": 0, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 15, "sequential_3step": 0, "conditional_routing": 10, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 0, "conditional_routing_stateful": 7, 
"sequential_reasoning_stateful": 7, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 2.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 3.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 13.37, 
"argument_fidelity": 26.41, "tool_selection": 26.71, "basic_2step": 16.38, "sequential_3step": 25.22, "conditional_routing": 59.84, "sequential_reasoning": 57.18, "error_recovery": 0.0, "data_gap_recovery": 67.48, "data_gap_recovery_extended": 96.99, "argument_transformation": 196.13, "grounded_synthesis": 179.88, "inconsistent_api_recovery": 121.95, "relevance_detection_stateful": 13.9, "argument_fidelity_stateful": 26.76, "tool_selection_stateful": 28.12, "basic_2step_stateful": 16.73, "sequential_3step_stateful": 24.96, "conditional_routing_stateful": 52.97, "sequential_reasoning_stateful": 43.99, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 64.86, "data_gap_recovery_extended_stateful": 77.11, "argument_transformation_stateful": 428.62, "grounded_synthesis_stateful": 172.74, "inconsistent_api_recovery_stateful": 126.46}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/P [bare]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 17.8, "accuracy": 19.6, "completeness": 90.5, "efficiency": 100.0, "wasted": 0.0, 
"speed": 2.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 2, "sequential_3step": 2, "conditional_routing": 40, "sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 6, "sequential_3step_stateful": 2, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 1, "sequential_3step": 1, "conditional_routing": 20, "sequential_reasoning": 31, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 3, "sequential_3step_stateful": 1, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, 
"argument_transformation_stateful": 39, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 2, "sequential_3step": 3, "conditional_routing": 80, "sequential_reasoning": 124, "error_recovery": 0, "data_gap_recovery": 20, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 6, "sequential_3step_stateful": 3, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 1, "sequential_3step": 2, "conditional_routing": 40, "sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 9, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 3, "sequential_3step_stateful": 2, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, 
"data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.55, "argument_fidelity": 42.13, "tool_selection": 41.43, "basic_2step": 22.49, "sequential_3step": 55.55, "conditional_routing": 102.42, "sequential_reasoning": 88.78, "error_recovery": 0.0, "data_gap_recovery": 125.42, "data_gap_recovery_extended": 181.21, "argument_transformation": 383.24, "grounded_synthesis": 231.36, "inconsistent_api_recovery": 167.09, "relevance_detection_stateful": 22.17, "argument_fidelity_stateful": 42.66, "tool_selection_stateful": 42.21, "basic_2step_stateful": 23.24, 
"sequential_3step_stateful": 39.86, "conditional_routing_stateful": 94.82, "sequential_reasoning_stateful": 69.21, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 144.37, "data_gap_recovery_extended_stateful": 132.23, "argument_transformation_stateful": 338.66, "grounded_synthesis_stateful": 240.48, "inconsistent_api_recovery_stateful": 173.39}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:8b-instruct-2512-q8_0 OL/N [bare]", "model": "ministral-3:8b-instruct-2512-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 17.8, "accuracy": 49.6, "completeness": 36.0, "efficiency": 93.3, "wasted": 0.4, "speed": 6.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 68, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 0, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, 
"basic_2step_stateful": 0, "sequential_3step_stateful": 64, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": 
{"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, "grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, "grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 102, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 35, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 96, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 102, "conditional_routing": 248, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 21, "data_gap_recovery_extended": 0, "argument_transformation": 92, "grounded_synthesis": 0, "inconsistent_api_recovery": 334, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 96, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 48.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 48.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 8.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 
34.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, "grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 32.28, "conditional_routing": 198.33, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 42.63, "data_gap_recovery_extended": 357.48, "argument_transformation": 182.43, "grounded_synthesis": 418.92, "inconsistent_api_recovery": 384.02, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 30.3, "conditional_routing_stateful": 194.38, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 314.74, "argument_transformation_stateful": 172.2, "grounded_synthesis_stateful": 459.71, "inconsistent_api_recovery_stateful": 386.94}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, 
"data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, "grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-tiny-q4_K_M OL/N [bare]", "model": "granite-4.0:h-tiny-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 15.5, "accuracy": 33.2, "completeness": 46.8, "efficiency": 100.0, "wasted": 1.0, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 4, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 2, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 8, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, 
"conditional_routing": 8, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 112.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 188.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, 
"basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 87.34, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 81.1, "conditional_routing": 158.74, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 6.4, "data_gap_recovery_extended": 0.0, "argument_transformation": 130.11, "grounded_synthesis": 346.43, "inconsistent_api_recovery": 330.28, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 87.09, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 80.82, "conditional_routing_stateful": 158.96, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.84, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 127.4, "grounded_synthesis_stateful": 353.58, "inconsistent_api_recovery_stateful": 378.46}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 
50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/N [bare]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 15.8, "accuracy": 100.0, "completeness": 15.8, "efficiency": 100.0, "wasted": 0.0, "speed": 11.0, "n": 50, "scenarios": {"relevance_detection": 88, "argument_fidelity": 0, "tool_selection": 12, "basic_2step": 30, "sequential_3step": 2, "conditional_routing": 32, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 16, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 96, "argument_fidelity_stateful": 0, "tool_selection_stateful": 24, "basic_2step_stateful": 56, "sequential_3step_stateful": 4, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioCorrect": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 
48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 18, "basic_2step": 30, "sequential_3step": 3, "conditional_routing": 64, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 80, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 36, "basic_2step_stateful": 56, "sequential_3step_stateful": 6, "conditional_routing_stateful": 80, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 18, "basic_2step": 30, "sequential_3step": 3, "conditional_routing": 32, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 24, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 36, "basic_2step_stateful": 56, "sequential_3step_stateful": 6, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 
9, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 191.39, "argument_fidelity": 0.0, "tool_selection": 38.76, "basic_2step": 66.96, "sequential_3step": 8.19, "conditional_routing": 372.94, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 25.39, "data_gap_recovery_extended": 0.0, 
"argument_transformation": 0.0, "grounded_synthesis": 444.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 229.63, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 80.74, "basic_2step_stateful": 133.43, "sequential_3step_stateful": 17.0, "conditional_routing_stateful": 453.29, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 52.57, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 154.58, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}}, {"label": "granite-4.0-h-tiny-Q4_K_M LS/N [bare]", "model": "granite-4.0-h-tiny-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 15.4, "accuracy": 33.3, "completeness": 46.2, "efficiency": 100.0, "wasted": 2.6, "speed": 3.5, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, 
"sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 500.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 147.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 
0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 500.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 98.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 56.19, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 47.39, "conditional_routing": 140.43, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 276.63, "grounded_synthesis": 299.06, "inconsistent_api_recovery": 228.38, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 55.94, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 47.0, "conditional_routing_stateful": 139.78, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 280.87, "grounded_synthesis_stateful": 286.68, "inconsistent_api_recovery_stateful": 
215.85}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:8b-instruct-2512-q4_K_M OL/N [bare]", "model": "ministral-3:8b-instruct-2512-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 14.2, "accuracy": 45.1, "completeness": 31.4, "efficiency": 90.9, "wasted": 0.2, "speed": 5.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 76, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 86, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 35, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 114, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 129, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 55, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, 
"inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 114, "conditional_routing": 244, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 6, "inconsistent_api_recovery": 26, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 129, "conditional_routing_stateful": 175, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 55, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 35.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, 
"argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 45.05, "conditional_routing": 276.25, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 20.46, "data_gap_recovery_extended": 0.0, "argument_transformation": 220.4, "grounded_synthesis": 370.1, "inconsistent_api_recovery": 28.25, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 49.9, "conditional_routing_stateful": 220.36, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 70.27, "data_gap_recovery_extended_stateful": 255.32, "argument_transformation_stateful": 221.32, "grounded_synthesis_stateful": 312.31, "inconsistent_api_recovery_stateful": 28.44}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/P [bare]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 11.5, "accuracy": 21.4, "completeness": 53.8, "efficiency": 100.0, "wasted": 0.0, "speed": 1.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, 
"conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 
24.28, "argument_fidelity": 38.62, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 66.93, "conditional_routing": 134.76, "sequential_reasoning": 97.67, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 125.96, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 132.54, "relevance_detection_stateful": 23.99, "argument_fidelity_stateful": 38.58, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 66.92, "conditional_routing_stateful": 136.13, "sequential_reasoning_stateful": 72.65, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 125.73, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 132.17}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/N [bare]", "model": "Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 12.2, "accuracy": 97.5, "completeness": 12.5, "efficiency": 100.0, "wasted": 0.0, "speed": 3.9, "n": 50, 
"scenarios": {"relevance_detection": 76, "argument_fidelity": 2, "tool_selection": 0, "basic_2step": 28, "sequential_3step": 8, "conditional_routing": 26, "sequential_reasoning": 6, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 70, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 8, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, 
"grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 38, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 28, "sequential_3step": 12, "conditional_routing": 52, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 15, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 12, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 38, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 28, "sequential_3step": 12, "conditional_routing": 40, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 12, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 3.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, 
"argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 3.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 51.3, "argument_fidelity": 2.5, "tool_selection": 0.0, "basic_2step": 17.65, "sequential_3step": 10.61, "conditional_routing": 105.19, "sequential_reasoning": 10.71, "error_recovery": 0.0, "data_gap_recovery": 25.3, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 52.37, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 46.57, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 18.96, "sequential_3step_stateful": 10.04, "conditional_routing_stateful": 125.63, 
"sequential_reasoning_stateful": 17.22, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 14.09, "data_gap_recovery_extended_stateful": 7.63, "argument_transformation_stateful": 24.25, "grounded_synthesis_stateful": 95.47, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}}, {"label": "mistral:7b-instruct-v0.3-q4_K_M OL/N [reforged]", "model": "mistral:7b-instruct-v0.3-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 6.3, "accuracy": 39.6, "completeness": 15.9, "efficiency": 62.6, "wasted": 2.6, "speed": 6.5, "n": 50, "scenarios": {"relevance_detection": 14, "argument_fidelity": 4, "tool_selection": 0, "basic_2step": 12, "sequential_3step": 44, "conditional_routing": 0, "sequential_reasoning": 10, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 6, "sequential_3step_stateful": 50, "conditional_routing_stateful": 4, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 7, "argument_fidelity": 2, "tool_selection": 0, "basic_2step": 6, "sequential_3step": 22, "conditional_routing": 0, "sequential_reasoning": 5, "error_recovery": 1, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 1, "tool_selection_stateful": 0, "basic_2step_stateful": 3, "sequential_3step_stateful": 25, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 2, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, 
"sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}, "scenarioValidated": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, "sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}, "scenarioIdealCalls": {"relevance_detection": 7, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 12, "sequential_3step": 66, "conditional_routing": 0, "sequential_reasoning": 20, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 3, "tool_selection_stateful": 0, "basic_2step_stateful": 6, 
"sequential_3step_stateful": 75, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 13, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 18, "sequential_3step": 101, "conditional_routing": 0, "sequential_reasoning": 30, "error_recovery": 5, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 4, "tool_selection_stateful": 0, "basic_2step_stateful": 17, "sequential_3step_stateful": 111, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 14, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 6.0, "argument_fidelity": 42.0, "tool_selection": 0.0, "basic_2step": 7.0, "sequential_3step": 45.0, "conditional_routing": 46.0, "sequential_reasoning": 23.0, "error_recovery": 84.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 56.0, "relevance_detection_stateful": 4.0, "argument_fidelity_stateful": 61.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 16.0, "sequential_3step_stateful": 40.0, "conditional_routing_stateful": 15.0, "sequential_reasoning_stateful": 2.0, "error_recovery_stateful": 51.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 2.0, "inconsistent_api_recovery_stateful": 
34.0}, "scenarioWastedN": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, "sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}, "scenarioSpeedSum": {"relevance_detection": 5.57, "argument_fidelity": 37.33, "tool_selection": 0.0, "basic_2step": 9.67, "sequential_3step": 196.78, "conditional_routing": 80.48, "sequential_reasoning": 36.64, "error_recovery": 72.39, "data_gap_recovery": 31.36, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 19.6, "inconsistent_api_recovery": 239.08, "relevance_detection_stateful": 5.85, "argument_fidelity_stateful": 40.87, "tool_selection_stateful": 0.0, "basic_2step_stateful": 10.5, "sequential_3step_stateful": 157.16, "conditional_routing_stateful": 39.7, "sequential_reasoning_stateful": 3.43, "error_recovery_stateful": 56.01, "data_gap_recovery_stateful": 11.58, "data_gap_recovery_extended_stateful": 18.14, "argument_transformation_stateful": 24.92, "grounded_synthesis_stateful": 33.24, "inconsistent_api_recovery_stateful": 222.02}, "scenarioSpeedN": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, "sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}}, {"label": "mistral:7b-instruct-v0.3-q8_0 OL/N [reforged]", "model": "mistral:7b-instruct-v0.3-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 5.7, "accuracy": 42.3, "completeness": 13.5, "efficiency": 62.7, "wasted": 3.2, "speed": 9.4, "n": 50, "scenarios": {"relevance_detection": 14, "argument_fidelity": 2, "tool_selection": 2, "basic_2step": 20, "sequential_3step": 34, "conditional_routing": 4, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 44, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 7, "argument_fidelity": 1, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 17, "conditional_routing": 2, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 1, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 22, "conditional_routing_stateful": 3, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 7}, "scenarioValidated": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 7}, "scenarioIdealCalls": {"relevance_detection": 7, "argument_fidelity": 3, "tool_selection": 3, "basic_2step": 20, "sequential_3step": 51, "conditional_routing": 8, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 3, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 66, "conditional_routing_stateful": 12, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 16, "argument_fidelity": 5, "tool_selection": 6, "basic_2step": 39, "sequential_3step": 72, "conditional_routing": 17, "sequential_reasoning": 31, "error_recovery": 0, 
"data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 12, "argument_fidelity_stateful": 4, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 91, "conditional_routing_stateful": 23, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 9.0, "argument_fidelity": 55.0, "tool_selection": 3.0, "basic_2step": 19.0, "sequential_3step": 28.0, "conditional_routing": 58.0, "sequential_reasoning": 35.0, "error_recovery": 62.0, "data_gap_recovery": 13.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 7.0, "grounded_synthesis": 15.0, "inconsistent_api_recovery": 26.0, "relevance_detection_stateful": 17.0, "argument_fidelity_stateful": 77.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 34.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 38.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 4.0, "inconsistent_api_recovery_stateful": 15.0}, "scenarioWastedN": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, 
"conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 7}, "scenarioSpeedSum": {"relevance_detection": 15.16, "argument_fidelity": 46.53, "tool_selection": 7.07, "basic_2step": 19.31, "sequential_3step": 148.65, "conditional_routing": 161.98, "sequential_reasoning": 74.57, "error_recovery": 75.37, "data_gap_recovery": 43.5, "data_gap_recovery_extended": 0.0, "argument_transformation": 20.95, "grounded_synthesis": 135.52, "inconsistent_api_recovery": 195.56, "relevance_detection_stateful": 19.46, "argument_fidelity_stateful": 66.19, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 187.68, "conditional_routing_stateful": 104.46, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 60.51, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 102.09, "inconsistent_api_recovery_stateful": 164.64}, "scenarioSpeedN": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 7}}, 
{"label": "llama3.1:8b-instruct-q4_K_M OL/N [reforged]", "model": "llama3.1:8b-instruct-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 4.7, "accuracy": 7.2, "completeness": 64.8, "efficiency": 45.1, "wasted": 3.2, "speed": 3.7, "n": 50, "scenarios": {"relevance_detection": 8, "argument_fidelity": 0, "tool_selection": 2, "basic_2step": 34, "sequential_3step": 4, "conditional_routing": 0, "sequential_reasoning": 6, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 12, "argument_fidelity_stateful": 8, "tool_selection_stateful": 6, "basic_2step_stateful": 14, "sequential_3step_stateful": 6, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 
4, "argument_fidelity": 0, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 2, "conditional_routing": 0, "sequential_reasoning": 3, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 4, "tool_selection_stateful": 3, "basic_2step_stateful": 7, "sequential_3step_stateful": 3, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 3, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, "data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, "data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}, "scenarioIdealCalls": {"relevance_detection": 4, "argument_fidelity": 0, "tool_selection": 3, "basic_2step": 34, "sequential_3step": 6, "conditional_routing": 0, "sequential_reasoning": 12, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 12, "tool_selection_stateful": 9, "basic_2step_stateful": 14, "sequential_3step_stateful": 9, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 11, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 51, "sequential_3step": 17, "conditional_routing": 0, "sequential_reasoning": 28, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 12, "argument_fidelity_stateful": 42, "tool_selection_stateful": 26, "basic_2step_stateful": 22, "sequential_3step_stateful": 15, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 21, "data_gap_recovery_stateful": 25, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 38}, "scenarioWastedSum": {"relevance_detection": 7.0, "argument_fidelity": 246.0, "tool_selection": 57.0, "basic_2step": 67.0, "sequential_3step": 145.0, "conditional_routing": 126.0, "sequential_reasoning": 217.0, "error_recovery": 90.0, "data_gap_recovery": 165.0, "data_gap_recovery_extended": 111.0, "argument_transformation": 49.0, "grounded_synthesis": 47.0, "inconsistent_api_recovery": 139.0, "relevance_detection_stateful": 16.0, "argument_fidelity_stateful": 191.0, "tool_selection_stateful": 174.0, "basic_2step_stateful": 67.0, "sequential_3step_stateful": 149.0, "conditional_routing_stateful": 114.0, "sequential_reasoning_stateful": 56.0, "error_recovery_stateful": 72.0, "data_gap_recovery_stateful": 61.0, "data_gap_recovery_extended_stateful": 104.0, "argument_transformation_stateful": 49.0, "grounded_synthesis_stateful": 42.0, "inconsistent_api_recovery_stateful": 141.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, "data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}, "scenarioSpeedSum": {"relevance_detection": 33.41, "argument_fidelity": 150.41, "tool_selection": 103.05, "basic_2step": 48.53, "sequential_3step": 216.58, "conditional_routing": 109.19, 
"sequential_reasoning": 124.93, "error_recovery": 90.69, "data_gap_recovery": 136.57, "data_gap_recovery_extended": 164.42, "argument_transformation": 164.24, "grounded_synthesis": 289.69, "inconsistent_api_recovery": 162.33, "relevance_detection_stateful": 34.4, "argument_fidelity_stateful": 111.24, "tool_selection_stateful": 122.42, "basic_2step_stateful": 39.97, "sequential_3step_stateful": 148.81, "conditional_routing_stateful": 80.33, "sequential_reasoning_stateful": 27.92, "error_recovery_stateful": 68.91, "data_gap_recovery_stateful": 64.2, "data_gap_recovery_extended_stateful": 136.16, "argument_transformation_stateful": 101.58, "grounded_synthesis_stateful": 252.92, "inconsistent_api_recovery_stateful": 145.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, "data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/N [reforged]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 5.3, "accuracy": 100.0, "completeness": 5.3, "efficiency": 40.8, "wasted": 1.4, "speed": 1.1, "n": 50, "scenarios": {"relevance_detection": 68, "argument_fidelity": 0, "tool_selection": 
0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 70, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, 
"basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 
34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 84, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 85, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, 
"argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 39.35, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 38.07, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, 
"argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "llama3.1:8b-instruct-q8_0 OL/N [reforged]", "model": "llama3.1:8b-instruct-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 4.3, "accuracy": 8.2, "completeness": 52.5, "efficiency": 42.7, "wasted": 2.9, "speed": 4.9, "n": 50, "scenarios": {"relevance_detection": 8, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 20, "sequential_3step": 14, "conditional_routing": 0, "sequential_reasoning": 22, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 4, "tool_selection_stateful": 12, "basic_2step_stateful": 10, "sequential_3step_stateful": 8, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 4, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 10, "sequential_3step": 7, "conditional_routing": 0, "sequential_reasoning": 11, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 3, "argument_fidelity_stateful": 2, "tool_selection_stateful": 6, "basic_2step_stateful": 5, "sequential_3step_stateful": 4, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 3, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 
8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}, "scenarioIdealCalls": {"relevance_detection": 4, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 20, "sequential_3step": 21, "conditional_routing": 0, "sequential_reasoning": 44, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 3, "argument_fidelity_stateful": 6, "tool_selection_stateful": 18, "basic_2step_stateful": 10, "sequential_3step_stateful": 12, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 8, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 31, "sequential_3step": 63, "conditional_routing": 0, "sequential_reasoning": 112, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 16, "tool_selection_stateful": 56, "basic_2step_stateful": 15, "sequential_3step_stateful": 22, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 15, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 5.0, "argument_fidelity": 157.0, "tool_selection": 57.0, "basic_2step": 85.0, "sequential_3step": 159.0, "conditional_routing": 52.0, "sequential_reasoning": 216.0, "error_recovery": 30.0, "data_gap_recovery": 123.0, "data_gap_recovery_extended": 91.0, "argument_transformation": 16.0, "grounded_synthesis": 77.0, "inconsistent_api_recovery": 91.0, "relevance_detection_stateful": 7.0, "argument_fidelity_stateful": 127.0, "tool_selection_stateful": 131.0, "basic_2step_stateful": 82.0, "sequential_3step_stateful": 130.0, "conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 74.0, "error_recovery_stateful": 28.0, "data_gap_recovery_stateful": 66.0, "data_gap_recovery_extended_stateful": 44.0, "argument_transformation_stateful": 35.0, "grounded_synthesis_stateful": 27.0, "inconsistent_api_recovery_stateful": 34.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, 
"conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}, "scenarioSpeedSum": {"relevance_detection": 43.59, "argument_fidelity": 120.73, "tool_selection": 143.45, "basic_2step": 82.63, "sequential_3step": 315.07, "conditional_routing": 103.27, "sequential_reasoning": 205.43, "error_recovery": 37.73, "data_gap_recovery": 174.04, "data_gap_recovery_extended": 161.48, "argument_transformation": 72.96, "grounded_synthesis": 408.76, "inconsistent_api_recovery": 114.74, "relevance_detection_stateful": 48.6, "argument_fidelity_stateful": 109.38, "tool_selection_stateful": 118.69, "basic_2step_stateful": 63.48, "sequential_3step_stateful": 235.0, "conditional_routing_stateful": 49.48, "sequential_reasoning_stateful": 53.81, "error_recovery_stateful": 55.3, "data_gap_recovery_stateful": 86.88, "data_gap_recovery_extended_stateful": 82.56, "argument_transformation_stateful": 78.22, "grounded_synthesis_stateful": 326.05, "inconsistent_api_recovery_stateful": 55.58}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}}, {"label": "mistral:7b-instruct-v0.3-q8_0 OL/N [bare]", "model": "mistral:7b-instruct-v0.3-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 2.6, "accuracy": 11.4, "completeness": 22.9, "efficiency": 100.0, "wasted": 0.0, "speed": 1.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 68, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 34, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, 
"scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 68, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 34, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 55.81, "tool_selection": 14.95, "basic_2step": 30.51, "sequential_3step": 66.98, "conditional_routing": 4.25, "sequential_reasoning": 18.25, "error_recovery": 4.08, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 19.97, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 40.79, "tool_selection_stateful": 13.16, "basic_2step_stateful": 32.99, "sequential_3step_stateful": 68.75, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 11.09, "error_recovery_stateful": 5.63, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 17.85}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}}, {"label": "mistral:7b-instruct-v0.3-q4_K_M OL/N [bare]", "model": "mistral:7b-instruct-v0.3-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, 
"score": 2.7, "accuracy": 12.0, "completeness": 22.5, "efficiency": 100.0, "wasted": 0.0, "speed": 1.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 70, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 35, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, "sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, "sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 70, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 35, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, 
"data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, "sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 26.47, "tool_selection": 14.46, "basic_2step": 18.61, "sequential_3step": 33.73, "conditional_routing": 3.14, "sequential_reasoning": 8.16, "error_recovery": 9.7, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 3.13, "grounded_synthesis": 8.19, "inconsistent_api_recovery": 18.65, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 26.43, "tool_selection_stateful": 14.14, "basic_2step_stateful": 21.86, "sequential_3step_stateful": 
48.71, "conditional_routing_stateful": 2.37, "sequential_reasoning_stateful": 1.06, "error_recovery_stateful": 6.98, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 15.71}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, "sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}}, {"label": "llama3.1:8b-instruct-q4_K_M OL/N [bare]", "model": "llama3.1:8b-instruct-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.5, "accuracy": 2.2, "completeness": 24.1, "efficiency": 100.0, "wasted": 0.0, "speed": 1.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 14, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 7, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 1, 
"sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 1, "sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 14, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, 
"sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 7, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": 
{"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 1, "sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 7.62, "tool_selection": 0.0, "basic_2step": 0.51, "sequential_3step": 1.95, "conditional_routing": 25.53, "sequential_reasoning": 0.0, "error_recovery": 18.11, "data_gap_recovery": 24.0, "data_gap_recovery_extended": 18.25, "argument_transformation": 37.71, "grounded_synthesis": 58.91, "inconsistent_api_recovery": 3.04, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.23, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 6.24, "conditional_routing_stateful": 12.73, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 19.1, "data_gap_recovery_stateful": 25.59, "data_gap_recovery_extended_stateful": 23.59, "argument_transformation_stateful": 25.13, "grounded_synthesis_stateful": 59.69, "inconsistent_api_recovery_stateful": 6.8}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 1, "sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, 
"inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}}, {"label": "llama3.1:8b-instruct-q8_0 OL/N [bare]", "model": "llama3.1:8b-instruct-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 0.2, "accuracy": 0.5, "completeness": 29.4, "efficiency": 100.0, "wasted": 0.0, "speed": 1.6, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 46, 
"inconsistent_api_recovery_stateful": 19}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 19}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, 
"data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 19}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 4.89, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 15.55, "conditional_routing": 30.56, "sequential_reasoning": 1.51, "error_recovery": 28.58, "data_gap_recovery": 42.56, "data_gap_recovery_extended": 50.7, "argument_transformation": 23.87, "grounded_synthesis": 78.96, "inconsistent_api_recovery": 43.42, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.45, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 14.21, "conditional_routing_stateful": 37.91, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 32.3, "data_gap_recovery_stateful": 26.8, "data_gap_recovery_extended_stateful": 26.64, "argument_transformation_stateful": 6.75, "grounded_synthesis_stateful": 87.16, "inconsistent_api_recovery_stateful": 43.5}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 19}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/N [reforged]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": 
"native", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, 
"sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 
0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, 
"basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/N [bare]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, 
"sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, 
"argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 
0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, 
"inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, 
"grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/N [bare]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "mistral-nemo:12b-instruct-2407-q4_K_M OL/N [bare]", "model": "mistral-nemo:12b-instruct-2407-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "family": 
"mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, 
"data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 
0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, 
"error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, 
"conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/N [bare]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, 
"sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, 
"conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, 
"argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/N [reforged]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, 
"scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}], "scenarios": ["relevance_detection", "argument_fidelity", "tool_selection", "basic_2step", "sequential_3step", "conditional_routing", "sequential_reasoning", "error_recovery", "data_gap_recovery", "data_gap_recovery_extended", "argument_transformation", 
"grounded_synthesis", "inconsistent_api_recovery", "relevance_detection_stateful", "argument_fidelity_stateful", "tool_selection_stateful", "basic_2step_stateful", "sequential_3step_stateful", "conditional_routing_stateful", "sequential_reasoning_stateful", "error_recovery_stateful", "data_gap_recovery_stateful", "data_gap_recovery_extended_stateful", "argument_transformation_stateful", "grounded_synthesis_stateful", "inconsistent_api_recovery_stateful"], "scenarioAbbrev": {"relevance_detection": "rel", "argument_fidelity": "arg", "tool_selection": "tsl", "basic_2step": "b2s", "sequential_3step": "s3s", "conditional_routing": "crt", "sequential_reasoning": "srn", "error_recovery": "err", "data_gap_recovery": "dgr", "data_gap_recovery_extended": "dge", "argument_transformation": "art", "grounded_synthesis": "grs", "inconsistent_api_recovery": "iar", "relevance_detection_stateful": "rel_s", "argument_fidelity_stateful": "arg_s", "tool_selection_stateful": "tsl_s", "basic_2step_stateful": "b2s_s", "sequential_3step_stateful": "s3s_s", "conditional_routing_stateful": "crt_s", "sequential_reasoning_stateful": "srn_s", "error_recovery_stateful": "err_s", "data_gap_recovery_stateful": "dgr_s", "data_gap_recovery_extended_stateful": "dge_s", "argument_transformation_stateful": "art_s", "grounded_synthesis_stateful": "grs_s", "inconsistent_api_recovery_stateful": "iar_s"}, "scenarioSuite": {"relevance_detection": "og18", "argument_fidelity": "og18", "tool_selection": "og18", "basic_2step": "og18", "sequential_3step": "og18", "conditional_routing": "og18", "sequential_reasoning": "og18", "error_recovery": "og18", "data_gap_recovery": "og18", "data_gap_recovery_extended": "advanced_reasoning", "argument_transformation": "advanced_reasoning", "grounded_synthesis": "advanced_reasoning", "inconsistent_api_recovery": "advanced_reasoning", "relevance_detection_stateful": "og18", "argument_fidelity_stateful": "og18", "tool_selection_stateful": "og18", "basic_2step_stateful": 
"og18", "sequential_3step_stateful": "og18", "conditional_routing_stateful": "og18", "sequential_reasoning_stateful": "og18", "error_recovery_stateful": "og18", "data_gap_recovery_stateful": "og18", "data_gap_recovery_extended_stateful": "advanced_reasoning", "argument_transformation_stateful": "advanced_reasoning", "grounded_synthesis_stateful": "advanced_reasoning", "inconsistent_api_recovery_stateful": "advanced_reasoning"}, "maxGen": 2, "genInfo": {"1": {"commit": "2b05dc4", "date": "2026-05-08", "note": "v0.6.0 suite \u2014 incl. Anthropic ablation"}, "2": {"commit": "655e1f6", "date": "2026-05-22", "note": "v0.7.0 lineup refresh (8\u201314B) + 32GB tier debut (v0.7.4)"}}, "timestamp": "2026-06-03 00:09"};</script>
+  <script>window.__FORGE_DATA__ = {"rows": [{"label": "claude-opus-4-8 AN/N [reforged]", "model": "claude-opus-4-8", "backend": "anthropic", "mode": "native", "ablation": "reforged", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 100.0, "accuracy": 100.0, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.0, "speed": 13.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 100, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 
50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 250, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 250, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 188, "data_gap_recovery_extended": 200, "argument_transformation": 150, "grounded_synthesis": 170, "inconsistent_api_recovery": 300, "relevance_detection_stateful": 58, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, 
"sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 176, "data_gap_recovery_extended_stateful": 200, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 171, "inconsistent_api_recovery_stateful": 300}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 8.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 140.97, 
"argument_fidelity": 376.21, "tool_selection": 317.93, "basic_2step": 180.23, "sequential_3step": 448.71, "conditional_routing": 777.65, "sequential_reasoning": 499.74, "error_recovery": 291.19, "data_gap_recovery": 659.54, "data_gap_recovery_extended": 901.91, "argument_transformation": 1503.0, "grounded_synthesis": 1567.05, "inconsistent_api_recovery": 1054.74, "relevance_detection_stateful": 180.41, "argument_fidelity_stateful": 347.08, "tool_selection_stateful": 334.4, "basic_2step_stateful": 220.28, "sequential_3step_stateful": 431.18, "conditional_routing_stateful": 780.02, "sequential_reasoning_stateful": 466.59, "error_recovery_stateful": 282.12, "data_gap_recovery_stateful": 622.24, "data_gap_recovery_extended_stateful": 929.81, "argument_transformation_stateful": 1496.25, "grounded_synthesis_stateful": 1564.76, "inconsistent_api_recovery_stateful": 975.63}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-sonnet-4-6 AN/N [reforged]", "model": "claude-sonnet-4-6", "backend": "anthropic", "mode": "native", "ablation": "reforged", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 100.0, "accuracy": 100.0, "completeness": 100.0, 
"efficiency": 100.0, "wasted": 0.0, "speed": 18.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 100, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 250, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 250, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 101, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 179, "argument_transformation": 150, "grounded_synthesis": 150, "inconsistent_api_recovery": 300, "relevance_detection_stateful": 52, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 103, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 168, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 154, "inconsistent_api_recovery_stateful": 300}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 
0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 2.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 180.7, "argument_fidelity": 371.73, "tool_selection": 370.36, "basic_2step": 241.05, "sequential_3step": 352.02, "conditional_routing": 726.62, "sequential_reasoning": 607.08, "error_recovery": 380.51, "data_gap_recovery": 879.6, "data_gap_recovery_extended": 1195.42, "argument_transformation": 2174.46, "grounded_synthesis": 2425.66, 
"inconsistent_api_recovery": 1743.59, "relevance_detection_stateful": 174.96, "argument_fidelity_stateful": 446.4, "tool_selection_stateful": 422.72, "basic_2step_stateful": 268.96, "sequential_3step_stateful": 421.64, "conditional_routing_stateful": 812.23, "sequential_reasoning_stateful": 663.87, "error_recovery_stateful": 415.6, "data_gap_recovery_stateful": 865.89, "data_gap_recovery_extended_stateful": 1238.26, "argument_transformation_stateful": 2104.74, "grounded_synthesis_stateful": 2527.38, "inconsistent_api_recovery_stateful": 1610.8}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-opus-4-6 AN/N [reforged:full]", "model": "claude-opus-4-6", "backend": "anthropic", "mode": "native", "ablation": "reforged", "replay": "full", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 99.2, "accuracy": 99.8, "completeness": 99.4, "efficiency": 100.0, "wasted": 0.0, "speed": 15.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, 
"argument_transformation": 98, "grounded_synthesis": 94, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 98, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 98}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, 
"sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 245, "grounded_synthesis": 470, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 245, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 392}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 98, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 150, "argument_transformation": 147, "grounded_synthesis": 141, "inconsistent_api_recovery": 349, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 151, "argument_transformation_stateful": 147, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 358}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, 
"argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 139.05, "argument_fidelity": 363.67, "tool_selection": 326.01, "basic_2step": 331.58, "sequential_3step": 446.21, "conditional_routing": 945.05, "sequential_reasoning": 464.91, "error_recovery": 308.01, "data_gap_recovery": 1029.22, "data_gap_recovery_extended": 1009.33, "argument_transformation": 1013.19, "grounded_synthesis": 1892.46, "inconsistent_api_recovery": 1518.48, "relevance_detection_stateful": 125.53, "argument_fidelity_stateful": 1096.25, "tool_selection_stateful": 454.47, "basic_2step_stateful": 226.33, "sequential_3step_stateful": 476.66, "conditional_routing_stateful": 701.36, "sequential_reasoning_stateful": 851.35, "error_recovery_stateful": 874.25, 
"data_gap_recovery_stateful": 639.94, "data_gap_recovery_extended_stateful": 891.99, "argument_transformation_stateful": 1022.41, "grounded_synthesis_stateful": 1857.26, "inconsistent_api_recovery_stateful": 1176.68}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged:full]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 94.8, "accuracy": 95.1, "completeness": 99.7, "efficiency": 100.0, "wasted": 0.6, "speed": 12.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 96, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 72, "argument_transformation": 78, "grounded_synthesis": 92, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, 
"sequential_reasoning_stateful": 92, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 68, "argument_transformation_stateful": 76, "grounded_synthesis_stateful": 94, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 36, "argument_transformation": 39, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 192, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 288, "argument_transformation": 195, "grounded_synthesis": 460, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 49, 
"argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 272, "argument_transformation_stateful": 190, "grounded_synthesis_stateful": 470, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 94, "argument_fidelity": 164, "tool_selection": 214, "basic_2step": 107, "sequential_3step": 186, "conditional_routing": 139, "sequential_reasoning": 249, "error_recovery": 167, "data_gap_recovery": 172, "data_gap_recovery_extended": 152, "argument_transformation": 186, "grounded_synthesis": 390, "inconsistent_api_recovery": 372, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 168, "tool_selection_stateful": 222, "basic_2step_stateful": 106, "sequential_3step_stateful": 192, "conditional_routing_stateful": 146, "sequential_reasoning_stateful": 222, "error_recovery_stateful": 158, "data_gap_recovery_stateful": 178, "data_gap_recovery_extended_stateful": 152, "argument_transformation_stateful": 188, "grounded_synthesis_stateful": 398, "inconsistent_api_recovery_stateful": 385}, "scenarioWastedSum": {"relevance_detection": 44.0, "argument_fidelity": 14.0, "tool_selection": 64.0, "basic_2step": 7.0, "sequential_3step": 36.0, "conditional_routing": 7.0, "sequential_reasoning": 59.0, "error_recovery": 67.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 19.0, "grounded_synthesis": 110.0, "inconsistent_api_recovery": 12.0, "relevance_detection_stateful": 39.0, "argument_fidelity_stateful": 18.0, "tool_selection_stateful": 72.0, "basic_2step_stateful": 6.0, "sequential_3step_stateful": 42.0, "conditional_routing_stateful": 12.0, "sequential_reasoning_stateful": 41.0, "error_recovery_stateful": 8.0, "data_gap_recovery_stateful": 2.0, 
"data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 27.0, "grounded_synthesis_stateful": 96.0, "inconsistent_api_recovery_stateful": 20.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 508.07, "argument_fidelity": 198.63, "tool_selection": 433.45, "basic_2step": 141.72, "sequential_3step": 273.79, "conditional_routing": 470.21, "sequential_reasoning": 420.81, "error_recovery": 620.19, "data_gap_recovery": 518.6, "data_gap_recovery_extended": 540.97, "argument_transformation": 1808.06, "grounded_synthesis": 1506.48, "inconsistent_api_recovery": 990.66, "relevance_detection_stateful": 499.19, "argument_fidelity_stateful": 202.09, "tool_selection_stateful": 381.19, "basic_2step_stateful": 138.95, "sequential_3step_stateful": 276.05, "conditional_routing_stateful": 540.46, "sequential_reasoning_stateful": 370.69, "error_recovery_stateful": 355.01, "data_gap_recovery_stateful": 505.92, "data_gap_recovery_extended_stateful": 570.54, "argument_transformation_stateful": 1824.15, "grounded_synthesis_stateful": 1422.5, "inconsistent_api_recovery_stateful": 952.6}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-haiku-4-5-20251001 AN/N [reforged]", "model": "claude-haiku-4-5-20251001", "backend": "anthropic", "mode": "native", "ablation": "reforged", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 94.2, "accuracy": 94.2, "completeness": 99.9, "efficiency": 100.0, "wasted": 0.3, "speed": 6.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 74, "argument_transformation": 74, "grounded_synthesis": 98, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 94, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 37, "argument_transformation": 37, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 
50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 296, "argument_transformation": 185, "grounded_synthesis": 490, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 288, 
"argument_transformation_stateful": 95, "grounded_synthesis_stateful": 470, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 103, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 152, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 201, "data_gap_recovery_extended": 184, "argument_transformation": 123, "grounded_synthesis": 155, "inconsistent_api_recovery": 357, "relevance_detection_stateful": 102, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 153, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 255, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 201, "data_gap_recovery_extended_stateful": 178, "argument_transformation_stateful": 65, "grounded_synthesis_stateful": 156, "inconsistent_api_recovery_stateful": 363}, "scenarioWastedSum": {"relevance_detection": 53.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 52.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 54.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 52.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 53.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 55.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 169.82, "argument_fidelity": 160.45, "tool_selection": 158.08, "basic_2step": 183.6, "sequential_3step": 191.38, "conditional_routing": 307.75, "sequential_reasoning": 373.51, "error_recovery": 148.96, "data_gap_recovery": 404.0, "data_gap_recovery_extended": 494.54, "argument_transformation": 424.22, "grounded_synthesis": 728.07, "inconsistent_api_recovery": 495.8, "relevance_detection_stateful": 174.15, "argument_fidelity_stateful": 160.91, "tool_selection_stateful": 170.8, "basic_2step_stateful": 198.37, "sequential_3step_stateful": 200.6, "conditional_routing_stateful": 317.82, "sequential_reasoning_stateful": 408.08, "error_recovery_stateful": 157.35, "data_gap_recovery_stateful": 407.33, "data_gap_recovery_extended_stateful": 488.54, "argument_transformation_stateful": 428.08, "grounded_synthesis_stateful": 682.16, "inconsistent_api_recovery_stateful": 533.97}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/N [reforged:full]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 93.2, "accuracy": 93.3, "completeness": 99.8, "efficiency": 82.3, "wasted": 1.4, "speed": 37.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 98, "data_gap_recovery": 100, "data_gap_recovery_extended": 74, "argument_transformation": 38, "grounded_synthesis": 88, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 78, "argument_transformation_stateful": 56, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 98}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 37, "argument_transformation": 19, "grounded_synthesis": 44, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 98, "data_gap_recovery": 250, "data_gap_recovery_extended": 296, "argument_transformation": 95, "grounded_synthesis": 440, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 312, "argument_transformation_stateful": 140, "grounded_synthesis_stateful": 480, "inconsistent_api_recovery_stateful": 392}, "scenarioActualCalls": {"relevance_detection": 53, "argument_fidelity": 254, "tool_selection": 191, "basic_2step": 135, "sequential_3step": 219, "conditional_routing": 255, "sequential_reasoning": 309, 
"error_recovery": 201, "data_gap_recovery": 310, "data_gap_recovery_extended": 248, "argument_transformation": 112, "grounded_synthesis": 238, "inconsistent_api_recovery": 608, "relevance_detection_stateful": 54, "argument_fidelity_stateful": 266, "tool_selection_stateful": 204, "basic_2step_stateful": 133, "sequential_3step_stateful": 234, "conditional_routing_stateful": 242, "sequential_reasoning_stateful": 315, "error_recovery_stateful": 220, "data_gap_recovery_stateful": 316, "data_gap_recovery_extended_stateful": 270, "argument_transformation_stateful": 172, "grounded_synthesis_stateful": 285, "inconsistent_api_recovery_stateful": 583}, "scenarioWastedSum": {"relevance_detection": 3.0, "argument_fidelity": 104.0, "tool_selection": 41.0, "basic_2step": 35.0, "sequential_3step": 69.0, "conditional_routing": 89.0, "sequential_reasoning": 109.0, "error_recovery": 109.0, "data_gap_recovery": 73.0, "data_gap_recovery_extended": 17.0, "argument_transformation": 45.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 216.0, "relevance_detection_stateful": 4.0, "argument_fidelity_stateful": 116.0, "tool_selection_stateful": 54.0, "basic_2step_stateful": 33.0, "sequential_3step_stateful": 84.0, "conditional_routing_stateful": 97.0, "sequential_reasoning_stateful": 115.0, "error_recovery_stateful": 70.0, "data_gap_recovery_stateful": 79.0, "data_gap_recovery_extended_stateful": 11.0, "argument_transformation_stateful": 48.0, "grounded_synthesis_stateful": 6.0, "inconsistent_api_recovery_stateful": 191.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 279.82, "argument_fidelity": 772.38, "tool_selection": 460.93, "basic_2step": 311.2, "sequential_3step": 653.05, "conditional_routing": 1843.5, "sequential_reasoning": 981.56, "error_recovery": 766.11, "data_gap_recovery": 2186.91, "data_gap_recovery_extended": 2653.6, "argument_transformation": 5412.58, "grounded_synthesis": 3722.24, "inconsistent_api_recovery": 4409.74, "relevance_detection_stateful": 256.35, "argument_fidelity_stateful": 789.72, "tool_selection_stateful": 472.2, "basic_2step_stateful": 317.74, "sequential_3step_stateful": 693.31, "conditional_routing_stateful": 1866.41, "sequential_reasoning_stateful": 948.52, "error_recovery_stateful": 704.44, "data_gap_recovery_stateful": 2160.24, "data_gap_recovery_extended_stateful": 2741.61, "argument_transformation_stateful": 5406.31, "grounded_synthesis_stateful": 3714.94, "inconsistent_api_recovery_stateful": 4325.4}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 
50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3.6-27B-Q4_K_M LS/N [reforged:full]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 92.2, "accuracy": 92.5, "completeness": 99.6, "efficiency": 100.0, "wasted": 0.4, "speed": 37.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 98, "data_gap_recovery": 100, "data_gap_recovery_extended": 22, "argument_transformation": 74, "grounded_synthesis": 98, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 78, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 11, "argument_transformation": 37, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 98, "data_gap_recovery": 250, "data_gap_recovery_extended": 88, "argument_transformation": 185, "grounded_synthesis": 490, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 144, "argument_transformation_stateful": 195, "grounded_synthesis_stateful": 480, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 66, "argument_fidelity": 155, "tool_selection": 161, "basic_2step": 108, "sequential_3step": 187, "conditional_routing": 190, "sequential_reasoning": 225, "error_recovery": 150, "data_gap_recovery": 186, "data_gap_recovery_extended": 48, "argument_transformation": 196, "grounded_synthesis": 194, "inconsistent_api_recovery": 419, "relevance_detection_stateful": 66, "argument_fidelity_stateful": 157, "tool_selection_stateful": 165, 
"basic_2step_stateful": 116, "sequential_3step_stateful": 183, "conditional_routing_stateful": 208, "sequential_reasoning_stateful": 223, "error_recovery_stateful": 146, "data_gap_recovery_stateful": 178, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 197, "grounded_synthesis_stateful": 173, "inconsistent_api_recovery_stateful": 421}, "scenarioWastedSum": {"relevance_detection": 16.0, "argument_fidelity": 5.0, "tool_selection": 11.0, "basic_2step": 8.0, "sequential_3step": 37.0, "conditional_routing": 28.0, "sequential_reasoning": 25.0, "error_recovery": 52.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 32.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 36.0, "relevance_detection_stateful": 16.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 15.0, "basic_2step_stateful": 16.0, "sequential_3step_stateful": 33.0, "conditional_routing_stateful": 44.0, "sequential_reasoning_stateful": 23.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 4.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 14.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 40.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 1708.49, "argument_fidelity": 577.79, "tool_selection": 455.19, "basic_2step": 362.6, "sequential_3step": 660.55, "conditional_routing": 1717.96, "sequential_reasoning": 1053.64, "error_recovery": 1021.89, "data_gap_recovery": 1994.37, "data_gap_recovery_extended": 1617.47, "argument_transformation": 5984.42, "grounded_synthesis": 3233.57, "inconsistent_api_recovery": 3652.94, "relevance_detection_stateful": 1602.1, "argument_fidelity_stateful": 580.54, "tool_selection_stateful": 435.14, "basic_2step_stateful": 600.39, "sequential_3step_stateful": 718.76, "conditional_routing_stateful": 1738.46, "sequential_reasoning_stateful": 1034.03, "error_recovery_stateful": 968.92, "data_gap_recovery_stateful": 1877.28, "data_gap_recovery_extended_stateful": 1731.45, "argument_transformation_stateful": 6709.6, "grounded_synthesis_stateful": 3147.67, "inconsistent_api_recovery_stateful": 3907.64}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged:full]", "model": "Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": 
"reforged", "replay": "full", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 92.1, "accuracy": 92.4, "completeness": 99.7, "efficiency": 82.1, "wasted": 1.3, "speed": 11.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 96, "argument_transformation": 14, "grounded_synthesis": 84, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 94, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 7, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 
50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 384, "argument_transformation": 35, "grounded_synthesis": 420, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 376, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 480, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 61, "argument_fidelity": 246, "tool_selection": 244, "basic_2step": 172, "sequential_3step": 215, "conditional_routing": 298, "sequential_reasoning": 317, "error_recovery": 178, "data_gap_recovery": 290, "data_gap_recovery_extended": 320, "argument_transformation": 43, "grounded_synthesis": 311, "inconsistent_api_recovery": 477, "relevance_detection_stateful": 59, "argument_fidelity_stateful": 251, "tool_selection_stateful": 236, "basic_2step_stateful": 193, "sequential_3step_stateful": 243, "conditional_routing_stateful": 308, "sequential_reasoning_stateful": 335, "error_recovery_stateful": 175, "data_gap_recovery_stateful": 277, "data_gap_recovery_extended_stateful": 310, "argument_transformation_stateful": 63, 
"grounded_synthesis_stateful": 334, "inconsistent_api_recovery_stateful": 469}, "scenarioWastedSum": {"relevance_detection": 11.0, "argument_fidelity": 96.0, "tool_selection": 94.0, "basic_2step": 72.0, "sequential_3step": 65.0, "conditional_routing": 122.0, "sequential_reasoning": 121.0, "error_recovery": 78.0, "data_gap_recovery": 54.0, "data_gap_recovery_extended": 16.0, "argument_transformation": 32.0, "grounded_synthesis": 6.0, "inconsistent_api_recovery": 84.0, "relevance_detection_stateful": 9.0, "argument_fidelity_stateful": 101.0, "tool_selection_stateful": 86.0, "basic_2step_stateful": 93.0, "sequential_3step_stateful": 93.0, "conditional_routing_stateful": 140.0, "sequential_reasoning_stateful": 135.0, "error_recovery_stateful": 25.0, "data_gap_recovery_stateful": 50.0, "data_gap_recovery_extended_stateful": 22.0, "argument_transformation_stateful": 39.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 71.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 91.81, "argument_fidelity": 239.08, "tool_selection": 171.19, "basic_2step": 125.25, "sequential_3step": 211.5, "conditional_routing": 563.1, "sequential_reasoning": 
329.57, "error_recovery": 185.75, "data_gap_recovery": 613.21, "data_gap_recovery_extended": 719.75, "argument_transformation": 1689.59, "grounded_synthesis": 1101.91, "inconsistent_api_recovery": 1167.15, "relevance_detection_stateful": 84.33, "argument_fidelity_stateful": 248.12, "tool_selection_stateful": 167.56, "basic_2step_stateful": 140.99, "sequential_3step_stateful": 241.01, "conditional_routing_stateful": 589.17, "sequential_reasoning_stateful": 349.54, "error_recovery_stateful": 185.09, "data_gap_recovery_stateful": 619.16, "data_gap_recovery_extended_stateful": 698.7, "argument_transformation_stateful": 1629.1, "grounded_synthesis_stateful": 1069.15, "inconsistent_api_recovery_stateful": 1150.96}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-opus-4-8 AN/N [bare]", "model": "claude-opus-4-8", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 88.0, "accuracy": 95.8, "completeness": 91.8, "efficiency": 100.0, "wasted": 0.0, "speed": 13.7, "n": 50, "scenarios": {"relevance_detection": 90, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, 
"sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 100, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 45, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 250, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 250, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 45, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 181, "data_gap_recovery_extended": 201, "argument_transformation": 151, "grounded_synthesis": 166, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 183, "data_gap_recovery_extended_stateful": 200, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 169, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, 
"data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 143.55, "argument_fidelity": 380.28, "tool_selection": 318.41, "basic_2step": 190.67, "sequential_3step": 419.58, "conditional_routing": 793.4, "sequential_reasoning": 464.13, "error_recovery": 0.0, "data_gap_recovery": 658.78, "data_gap_recovery_extended": 886.68, "argument_transformation": 1452.22, "grounded_synthesis": 1610.94, "inconsistent_api_recovery": 843.04, "relevance_detection_stateful": 137.02, "argument_fidelity_stateful": 362.32, "tool_selection_stateful": 325.32, "basic_2step_stateful": 230.6, 
"sequential_3step_stateful": 437.2, "conditional_routing_stateful": 794.38, "sequential_reasoning_stateful": 478.19, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 641.99, "data_gap_recovery_extended_stateful": 931.41, "argument_transformation_stateful": 1486.35, "grounded_synthesis_stateful": 1574.9, "inconsistent_api_recovery_stateful": 797.11}, "scenarioSpeedN": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-opus-4-6 AN/N [bare:full]", "model": "claude-opus-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "full", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 87.9, "accuracy": 95.8, "completeness": 91.8, "efficiency": 100.0, "wasted": 0.0, "speed": 16.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 98, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 100, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 96, "grounded_synthesis_stateful": 96, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 
0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 250, "grounded_synthesis": 500, 
"inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 240, "grounded_synthesis_stateful": 480, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 151, "argument_transformation": 151, "grounded_synthesis": 150, "inconsistent_api_recovery": 246, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 147, "data_gap_recovery_extended_stateful": 150, "argument_transformation_stateful": 144, "grounded_synthesis_stateful": 144, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 123.04, "argument_fidelity": 442.53, "tool_selection": 489.76, "basic_2step": 438.17, "sequential_3step": 532.18, "conditional_routing": 614.81, "sequential_reasoning": 554.22, "error_recovery": 0.0, "data_gap_recovery": 769.41, "data_gap_recovery_extended": 902.93, "argument_transformation": 1036.02, "grounded_synthesis": 1839.76, "inconsistent_api_recovery": 1041.4, "relevance_detection_stateful": 130.52, "argument_fidelity_stateful": 405.75, "tool_selection_stateful": 334.24, "basic_2step_stateful": 598.51, "sequential_3step_stateful": 682.77, "conditional_routing_stateful": 691.1, "sequential_reasoning_stateful": 546.89, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 702.07, "data_gap_recovery_extended_stateful": 1094.57, "argument_transformation_stateful": 1544.11, "grounded_synthesis_stateful": 3220.45, "inconsistent_api_recovery_stateful": 911.51}, "scenarioSpeedN": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-sonnet-4-6 AN/N [bare]", "model": "claude-sonnet-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 88.4, "accuracy": 95.8, "completeness": 92.3, "efficiency": 100.0, "wasted": 0.0, "speed": 18.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 98, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 245, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, 
"argument_transformation_stateful": 250, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 102, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 181, "argument_transformation": 147, "grounded_synthesis": 151, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 102, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 170, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 151, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 150.93, "argument_fidelity": 368.62, "tool_selection": 350.8, "basic_2step": 246.11, "sequential_3step": 387.18, "conditional_routing": 684.43, "sequential_reasoning": 575.03, "error_recovery": 0.0, "data_gap_recovery": 839.81, "data_gap_recovery_extended": 1129.53, "argument_transformation": 2132.89, "grounded_synthesis": 2646.22, "inconsistent_api_recovery": 1442.42, "relevance_detection_stateful": 163.93, "argument_fidelity_stateful": 342.19, "tool_selection_stateful": 347.62, "basic_2step_stateful": 227.88, "sequential_3step_stateful": 340.86, "conditional_routing_stateful": 676.19, "sequential_reasoning_stateful": 558.46, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 814.1, "data_gap_recovery_extended_stateful": 1104.4, "argument_transformation_stateful": 2101.14, "grounded_synthesis_stateful": 2589.26, "inconsistent_api_recovery_stateful": 1378.0}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/P [reforged:full]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 86.8, "accuracy": 86.8, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.1, "speed": 24.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 42, "argument_transformation": 10, "grounded_synthesis": 78, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 21, "argument_transformation": 5, "grounded_synthesis": 39, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 168, "argument_transformation": 25, "grounded_synthesis": 390, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 144, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 400, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 194, "sequential_reasoning": 200, 
"error_recovery": 150, "data_gap_recovery": 234, "data_gap_recovery_extended": 86, "argument_transformation": 19, "grounded_synthesis": 232, "inconsistent_api_recovery": 315, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 234, "data_gap_recovery_extended_stateful": 73, "argument_transformation_stateful": 18, "grounded_synthesis_stateful": 232, "inconsistent_api_recovery_stateful": 322}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 24.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 1.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 8.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 23.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 3.0, "inconsistent_api_recovery_stateful": 12.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 221.73, "argument_fidelity": 485.12, "tool_selection": 342.97, "basic_2step": 274.45, "sequential_3step": 478.63, "conditional_routing": 1301.97, "sequential_reasoning": 667.1, "error_recovery": 616.97, "data_gap_recovery": 1507.24, "data_gap_recovery_extended": 1727.18, "argument_transformation": 2545.03, "grounded_synthesis": 2936.59, "inconsistent_api_recovery": 2370.72, "relevance_detection_stateful": 209.71, "argument_fidelity_stateful": 494.69, "tool_selection_stateful": 346.25, "basic_2step_stateful": 259.08, "sequential_3step_stateful": 528.54, "conditional_routing_stateful": 1334.27, "sequential_reasoning_stateful": 683.83, "error_recovery_stateful": 632.94, "data_gap_recovery_stateful": 1450.28, "data_gap_recovery_extended_stateful": 1656.15, "argument_transformation_stateful": 3058.14, "grounded_synthesis_stateful": 3179.33, "inconsistent_api_recovery_stateful": 2373.82}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-opus-4-6 AN/N [bare+any:full]", "model": "claude-opus-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "full", "family": "claude", "quant": "n/a", "gen": 1, "retired": false, "score": 87.1, "accuracy": 95.4, "completeness": 91.3, "efficiency": 100.0, "wasted": 0.0, "speed": 12.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 80, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 86, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 
50, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 200, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 215, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 197, "argument_transformation": 219, "grounded_synthesis": 151, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, 
"conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 198, "argument_transformation_stateful": 241, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 21.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 26.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": 
{"relevance_detection": 260.72, "argument_fidelity": 477.63, "tool_selection": 418.17, "basic_2step": 200.47, "sequential_3step": 589.82, "conditional_routing": 581.21, "sequential_reasoning": 524.65, "error_recovery": 0.0, "data_gap_recovery": 757.27, "data_gap_recovery_extended": 1074.6, "argument_transformation": 779.87, "grounded_synthesis": 1425.95, "inconsistent_api_recovery": 708.76, "relevance_detection_stateful": 131.27, "argument_fidelity_stateful": 807.86, "tool_selection_stateful": 470.88, "basic_2step_stateful": 282.28, "sequential_3step_stateful": 468.06, "conditional_routing_stateful": 545.67, "sequential_reasoning_stateful": 541.4, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 613.07, "data_gap_recovery_extended_stateful": 878.83, "argument_transformation_stateful": 818.26, "grounded_synthesis_stateful": 1367.87, "inconsistent_api_recovery_stateful": 627.95}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, 
"score": 84.5, "accuracy": 84.8, "completeness": 99.7, "efficiency": 95.5, "wasted": 0.6, "speed": 5.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 96, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 76, "argument_transformation": 18, "grounded_synthesis": 44, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 80, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 54}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 
50, "data_gap_recovery_extended": 38, "argument_transformation": 9, "grounded_synthesis": 22, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 27}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 304, "argument_transformation": 45, "grounded_synthesis": 220, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 320, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 216}, "scenarioActualCalls": {"relevance_detection": 96, "argument_fidelity": 150, "tool_selection": 148, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 237, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 226, "data_gap_recovery_extended": 226, "argument_transformation": 47, "grounded_synthesis": 161, "inconsistent_api_recovery": 522, "relevance_detection_stateful": 96, "argument_fidelity_stateful": 150, "tool_selection_stateful": 152, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 229, "sequential_reasoning_stateful": 245, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 222, "data_gap_recovery_extended_stateful": 221, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 302}, "scenarioWastedSum": 
{"relevance_detection": 46.0, "argument_fidelity": 0.0, "tool_selection": 4.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 48.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 6.0, "argument_transformation": 32.0, "grounded_synthesis": 29.0, "inconsistent_api_recovery": 156.0, "relevance_detection_stateful": 46.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 2.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 40.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 7.0, "argument_transformation_stateful": 38.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 153.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 40.93, "argument_fidelity": 50.53, "tool_selection": 37.98, "basic_2step": 22.53, "sequential_3step": 48.73, "conditional_routing": 280.51, "sequential_reasoning": 256.61, "error_recovery": 29.47, "data_gap_recovery": 380.27, "data_gap_recovery_extended": 532.14, 
"argument_transformation": 703.11, "grounded_synthesis": 514.47, "inconsistent_api_recovery": 476.09, "relevance_detection_stateful": 41.93, "argument_fidelity_stateful": 50.14, "tool_selection_stateful": 40.96, "basic_2step_stateful": 26.06, "sequential_3step_stateful": 52.63, "conditional_routing_stateful": 297.63, "sequential_reasoning_stateful": 266.58, "error_recovery_stateful": 29.48, "data_gap_recovery_stateful": 401.31, "data_gap_recovery_extended_stateful": 512.5, "argument_transformation_stateful": 720.27, "grounded_synthesis_stateful": 556.88, "inconsistent_api_recovery_stateful": 441.2}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:keep-last]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 84.8, "accuracy": 85.6, "completeness": 99.1, "efficiency": 94.3, "wasted": 0.6, "speed": 5.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, 
"sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 98, "data_gap_recovery_extended": 70, "argument_transformation": 24, "grounded_synthesis": 42, "inconsistent_api_recovery": 86, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 82, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 62}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 35, "argument_transformation": 12, "grounded_synthesis": 21, "inconsistent_api_recovery": 43, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 31}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 42}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 42}, 
"scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 245, "data_gap_recovery_extended": 280, "argument_transformation": 60, "grounded_synthesis": 210, "inconsistent_api_recovery": 344, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 328, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 180, "inconsistent_api_recovery_stateful": 248}, "scenarioActualCalls": {"relevance_detection": 99, "argument_fidelity": 150, "tool_selection": 151, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 242, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 218, "data_gap_recovery_extended": 206, "argument_transformation": 48, "grounded_synthesis": 148, "inconsistent_api_recovery": 501, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 150, "tool_selection_stateful": 152, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 232, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 151, "data_gap_recovery_stateful": 219, "data_gap_recovery_extended_stateful": 249, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 117, "inconsistent_api_recovery_stateful": 381}, "scenarioWastedSum": {"relevance_detection": 49.0, "argument_fidelity": 0.0, "tool_selection": 1.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 48.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 4.0, 
"argument_transformation": 18.0, "grounded_synthesis": 17.0, "inconsistent_api_recovery": 161.0, "relevance_detection_stateful": 48.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 2.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 43.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 4.0, "argument_transformation_stateful": 12.0, "grounded_synthesis_stateful": 13.0, "inconsistent_api_recovery_stateful": 165.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 42}, "scenarioSpeedSum": {"relevance_detection": 46.13, "argument_fidelity": 49.69, "tool_selection": 37.61, "basic_2step": 22.54, "sequential_3step": 47.43, "conditional_routing": 317.01, "sequential_reasoning": 254.07, "error_recovery": 30.48, "data_gap_recovery": 386.24, "data_gap_recovery_extended": 524.88, "argument_transformation": 739.5, "grounded_synthesis": 602.57, "inconsistent_api_recovery": 732.67, "relevance_detection_stateful": 43.63, "argument_fidelity_stateful": 48.8, "tool_selection_stateful": 40.62, "basic_2step_stateful": 26.06, "sequential_3step_stateful": 47.69, 
"conditional_routing_stateful": 347.62, "sequential_reasoning_stateful": 274.69, "error_recovery_stateful": 30.83, "data_gap_recovery_stateful": 391.96, "data_gap_recovery_extended_stateful": 508.77, "argument_transformation_stateful": 749.81, "grounded_synthesis_stateful": 570.74, "inconsistent_api_recovery_stateful": 663.96}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 42}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 84.2, "accuracy": 90.9, "completeness": 92.7, "efficiency": 91.3, "wasted": 0.7, "speed": 4.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 6, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 98, "argument_transformation": 8, "grounded_synthesis": 100, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 4, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 96, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 98, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 49, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 9, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 392, 
"argument_transformation": 20, "grounded_synthesis": 500, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 384, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 490, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 15, "basic_2step": 100, "sequential_3step": 152, "conditional_routing": 250, "sequential_reasoning": 331, "error_recovery": 150, "data_gap_recovery": 159, "data_gap_recovery_extended": 329, "argument_transformation": 17, "grounded_synthesis": 611, "inconsistent_api_recovery": 395, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 10, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 343, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 157, "data_gap_recovery_extended_stateful": 324, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 599, "inconsistent_api_recovery_stateful": 455}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 6.0, "basic_2step": 0.0, "sequential_3step": 2.0, "conditional_routing": 50.0, "sequential_reasoning": 131.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 111.0, "inconsistent_api_recovery": 93.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 4.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, 
"sequential_reasoning_stateful": 143.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 113.0, "inconsistent_api_recovery_stateful": 93.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.78, "argument_fidelity": 64.45, "tool_selection": 9.1, "basic_2step": 29.26, "sequential_3step": 94.81, "conditional_routing": 154.34, "sequential_reasoning": 257.79, "error_recovery": 35.54, "data_gap_recovery": 211.3, "data_gap_recovery_extended": 594.58, "argument_transformation": 293.95, "grounded_synthesis": 763.93, "inconsistent_api_recovery": 268.91, "relevance_detection_stateful": 16.76, "argument_fidelity_stateful": 64.46, "tool_selection_stateful": 5.61, "basic_2step_stateful": 32.57, "sequential_3step_stateful": 87.84, "conditional_routing_stateful": 151.59, "sequential_reasoning_stateful": 273.93, "error_recovery_stateful": 35.55, "data_gap_recovery_stateful": 209.11, "data_gap_recovery_extended_stateful": 599.63, "argument_transformation_stateful": 294.53, "grounded_synthesis_stateful": 750.3, "inconsistent_api_recovery_stateful": 
269.04}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 3, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-opus-4-8 AN/N [bare+any]", "model": "claude-opus-4-8", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 83.6, "accuracy": 90.7, "completeness": 92.2, "efficiency": 100.0, "wasted": 0.0, "speed": 9.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 30, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, 
"scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 22, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 75, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, 
"data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 110, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 134, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 166, "argument_transformation": 74, "grounded_synthesis": 150, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 144, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 167, "argument_transformation_stateful": 113, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 24.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 34.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 120.42, "argument_fidelity": 276.11, "tool_selection": 326.67, "basic_2step": 150.32, "sequential_3step": 377.31, "conditional_routing": 552.82, "sequential_reasoning": 413.15, "error_recovery": 0.0, "data_gap_recovery": 466.73, "data_gap_recovery_extended": 590.92, "argument_transformation": 643.05, "grounded_synthesis": 1062.55, "inconsistent_api_recovery": 582.18, "relevance_detection_stateful": 114.59, "argument_fidelity_stateful": 292.79, "tool_selection_stateful": 301.93, "basic_2step_stateful": 165.34, "sequential_3step_stateful": 371.38, "conditional_routing_stateful": 597.71, "sequential_reasoning_stateful": 381.34, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 454.31, "data_gap_recovery_extended_stateful": 555.47, "argument_transformation_stateful": 690.14, "grounded_synthesis_stateful": 1110.72, "inconsistent_api_recovery_stateful": 513.84}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged:full]", "model": "Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 82.8, "accuracy": 82.8, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.2, "speed": 10.4, "n": 50, "scenarios": {"relevance_detection": 48, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 74, "argument_transformation": 16, "grounded_synthesis": 62, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 56, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 68, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 58, "inconsistent_api_recovery_stateful": 82}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 24, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 37, "argument_transformation": 8, "grounded_synthesis": 31, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 28, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 7, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 41}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 24, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 296, "argument_transformation": 40, "grounded_synthesis": 310, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 28, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 272, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 290, "inconsistent_api_recovery_stateful": 328}, "scenarioActualCalls": {"relevance_detection": 25, "argument_fidelity": 152, "tool_selection": 201, "basic_2step": 101, "sequential_3step": 150, 
"conditional_routing": 190, "sequential_reasoning": 197, "error_recovery": 161, "data_gap_recovery": 195, "data_gap_recovery_extended": 154, "argument_transformation": 29, "grounded_synthesis": 183, "inconsistent_api_recovery": 283, "relevance_detection_stateful": 30, "argument_fidelity_stateful": 152, "tool_selection_stateful": 200, "basic_2step_stateful": 109, "sequential_3step_stateful": 152, "conditional_routing_stateful": 198, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 159, "data_gap_recovery_stateful": 211, "data_gap_recovery_extended_stateful": 132, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 146, "inconsistent_api_recovery_stateful": 256}, "scenarioWastedSum": {"relevance_detection": 2.0, "argument_fidelity": 2.0, "tool_selection": 51.0, "basic_2step": 1.0, "sequential_3step": 0.0, "conditional_routing": 28.0, "sequential_reasoning": 1.0, "error_recovery": 61.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 9.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 2.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 9.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 30.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 9.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 1.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 60.9, "argument_fidelity": 177.92, "tool_selection": 172.67, "basic_2step": 429.56, "sequential_3step": 489.02, "conditional_routing": 476.02, "sequential_reasoning": 320.68, "error_recovery": 596.7, "data_gap_recovery": 444.68, "data_gap_recovery_extended": 542.16, "argument_transformation": 1056.25, "grounded_synthesis": 1059.21, "inconsistent_api_recovery": 1016.89, "relevance_detection_stateful": 56.3, "argument_fidelity_stateful": 283.08, "tool_selection_stateful": 171.91, "basic_2step_stateful": 274.07, "sequential_3step_stateful": 417.92, "conditional_routing_stateful": 488.83, "sequential_reasoning_stateful": 357.79, "error_recovery_stateful": 559.77, "data_gap_recovery_stateful": 440.86, "data_gap_recovery_extended_stateful": 593.31, "argument_transformation_stateful": 1069.76, "grounded_synthesis_stateful": 1053.18, "inconsistent_api_recovery_stateful": 879.17}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:full]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 83.2, "accuracy": 83.2, "completeness": 100.0, "efficiency": 97.0, "wasted": 0.6, "speed": 5.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 68, "data_gap_recovery_extended": 40, "argument_transformation": 32, "grounded_synthesis": 76, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 62, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 68, "inconsistent_api_recovery_stateful": 82}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 34, "data_gap_recovery_extended": 20, "argument_transformation": 16, "grounded_synthesis": 38, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 31, "data_gap_recovery_extended_stateful": 17, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 41}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 170, "data_gap_recovery_extended": 160, "argument_transformation": 80, "grounded_synthesis": 380, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 136, "argument_transformation_stateful": 75, "grounded_synthesis_stateful": 340, "inconsistent_api_recovery_stateful": 328}, "scenarioActualCalls": {"relevance_detection": 102, "argument_fidelity": 150, "tool_selection": 172, "basic_2step": 113, "sequential_3step": 152, "conditional_routing": 193, "sequential_reasoning": 245, "error_recovery": 150, "data_gap_recovery": 179, "data_gap_recovery_extended": 102, "argument_transformation": 71, "grounded_synthesis": 202, "inconsistent_api_recovery": 504, "relevance_detection_stateful": 101, 
"argument_fidelity_stateful": 150, "tool_selection_stateful": 169, "basic_2step_stateful": 105, "sequential_3step_stateful": 150, "conditional_routing_stateful": 183, "sequential_reasoning_stateful": 249, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 163, "data_gap_recovery_extended_stateful": 98, "argument_transformation_stateful": 72, "grounded_synthesis_stateful": 188, "inconsistent_api_recovery_stateful": 435}, "scenarioWastedSum": {"relevance_detection": 52.0, "argument_fidelity": 0.0, "tool_selection": 22.0, "basic_2step": 13.0, "sequential_3step": 2.0, "conditional_routing": 37.0, "sequential_reasoning": 49.0, "error_recovery": 50.0, "data_gap_recovery": 30.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 8.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 141.0, "relevance_detection_stateful": 51.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 19.0, "basic_2step_stateful": 5.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 22.0, "sequential_reasoning_stateful": 49.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 42.0, "data_gap_recovery_extended_stateful": 4.0, "argument_transformation_stateful": 15.0, "grounded_synthesis_stateful": 5.0, "inconsistent_api_recovery_stateful": 127.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 26.29, "argument_fidelity": 48.66, "tool_selection": 50.01, "basic_2step": 25.63, "sequential_3step": 59.08, "conditional_routing": 320.5, "sequential_reasoning": 195.8, "error_recovery": 30.41, "data_gap_recovery": 308.61, "data_gap_recovery_extended": 452.83, "argument_transformation": 754.66, "grounded_synthesis": 840.16, "inconsistent_api_recovery": 545.69, "relevance_detection_stateful": 27.27, "argument_fidelity_stateful": 49.6, "tool_selection_stateful": 48.71, "basic_2step_stateful": 28.02, "sequential_3step_stateful": 61.99, "conditional_routing_stateful": 311.83, "sequential_reasoning_stateful": 190.03, "error_recovery_stateful": 31.34, "data_gap_recovery_stateful": 309.06, "data_gap_recovery_extended_stateful": 447.26, "argument_transformation_stateful": 730.58, "grounded_synthesis_stateful": 799.36, "inconsistent_api_recovery_stateful": 539.81}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]", "model": 
"Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 83.3, "accuracy": 83.3, "completeness": 100.0, "efficiency": 96.5, "wasted": 0.6, "speed": 4.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 60, "data_gap_recovery_extended": 32, "argument_transformation": 34, "grounded_synthesis": 78, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 62, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 78, "inconsistent_api_recovery_stateful": 78}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 30, "data_gap_recovery_extended": 16, "argument_transformation": 17, "grounded_synthesis": 39, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 31, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 39}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 150, "data_gap_recovery_extended": 128, "argument_transformation": 85, "grounded_synthesis": 390, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 120, "argument_transformation_stateful": 70, "grounded_synthesis_stateful": 390, "inconsistent_api_recovery_stateful": 312}, "scenarioActualCalls": {"relevance_detection": 102, "argument_fidelity": 150, "tool_selection": 173, "basic_2step": 117, "sequential_3step": 152, "conditional_routing": 212, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 163, "data_gap_recovery_extended": 92, "argument_transformation": 92, "grounded_synthesis": 234, "inconsistent_api_recovery": 485, "relevance_detection_stateful": 102, "argument_fidelity_stateful": 150, "tool_selection_stateful": 171, "basic_2step_stateful": 104, "sequential_3step_stateful": 152, "conditional_routing_stateful": 208, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 167, 
"data_gap_recovery_extended_stateful": 76, "argument_transformation_stateful": 72, "grounded_synthesis_stateful": 209, "inconsistent_api_recovery_stateful": 389}, "scenarioWastedSum": {"relevance_detection": 52.0, "argument_fidelity": 0.0, "tool_selection": 23.0, "basic_2step": 17.0, "sequential_3step": 2.0, "conditional_routing": 33.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 27.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 15.0, "grounded_synthesis": 6.0, "inconsistent_api_recovery": 131.0, "relevance_detection_stateful": 52.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 21.0, "basic_2step_stateful": 4.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 40.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 30.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 24.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 109.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 27.61, "argument_fidelity": 49.32, "tool_selection": 51.81, "basic_2step": 26.75, 
"sequential_3step": 49.57, "conditional_routing": 279.49, "sequential_reasoning": 194.69, "error_recovery": 29.85, "data_gap_recovery": 248.33, "data_gap_recovery_extended": 374.27, "argument_transformation": 632.12, "grounded_synthesis": 727.52, "inconsistent_api_recovery": 341.62, "relevance_detection_stateful": 26.92, "argument_fidelity_stateful": 49.32, "tool_selection_stateful": 50.37, "basic_2step_stateful": 28.3, "sequential_3step_stateful": 50.97, "conditional_routing_stateful": 314.07, "sequential_reasoning_stateful": 197.43, "error_recovery_stateful": 32.24, "data_gap_recovery_stateful": 267.29, "data_gap_recovery_extended_stateful": 375.64, "argument_transformation_stateful": 675.51, "grounded_synthesis_stateful": 795.16, "inconsistent_api_recovery_stateful": 351.45}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:full]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 83.1, "accuracy": 83.1, "completeness": 99.9, "efficiency": 95.5, "wasted": 0.5, "speed": 6.0, "n": 50, 
"scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 66, "argument_transformation": 20, "grounded_synthesis": 36, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 74, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 52}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 33, "argument_transformation": 10, "grounded_synthesis": 18, 
"inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 26}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 264, "argument_transformation": 50, "grounded_synthesis": 180, "inconsistent_api_recovery": 352, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 296, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 130, "inconsistent_api_recovery_stateful": 208}, "scenarioActualCalls": {"relevance_detection": 98, "argument_fidelity": 150, "tool_selection": 154, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 237, "sequential_reasoning": 245, "error_recovery": 150, "data_gap_recovery": 227, "data_gap_recovery_extended": 173, "argument_transformation": 38, "grounded_synthesis": 121, "inconsistent_api_recovery": 478, "relevance_detection_stateful": 99, "argument_fidelity_stateful": 150, "tool_selection_stateful": 156, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 242, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 216, "data_gap_recovery_extended_stateful": 205, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 81, "inconsistent_api_recovery_stateful": 292}, "scenarioWastedSum": {"relevance_detection": 48.0, "argument_fidelity": 0.0, "tool_selection": 4.0, "basic_2step": 0.0, "sequential_3step": 
0.0, "conditional_routing": 44.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 13.0, "grounded_synthesis": 8.0, "inconsistent_api_recovery": 146.0, "relevance_detection_stateful": 49.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 6.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 46.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 24.0, "grounded_synthesis_stateful": 13.0, "inconsistent_api_recovery_stateful": 135.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 42.52, "argument_fidelity": 50.72, "tool_selection": 40.48, "basic_2step": 22.59, "sequential_3step": 51.83, "conditional_routing": 289.72, "sequential_reasoning": 264.65, "error_recovery": 31.07, "data_gap_recovery": 398.86, "data_gap_recovery_extended": 546.12, "argument_transformation": 784.53, "grounded_synthesis": 568.38, "inconsistent_api_recovery": 760.01, "relevance_detection_stateful": 42.98, 
"argument_fidelity_stateful": 48.93, "tool_selection_stateful": 43.62, "basic_2step_stateful": 26.1, "sequential_3step_stateful": 49.89, "conditional_routing_stateful": 357.32, "sequential_reasoning_stateful": 280.92, "error_recovery_stateful": 30.53, "data_gap_recovery_stateful": 406.42, "data_gap_recovery_extended_stateful": 514.61, "argument_transformation_stateful": 776.59, "grounded_synthesis_stateful": 537.28, "inconsistent_api_recovery_stateful": 807.65}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 82.9, "accuracy": 82.9, "completeness": 100.0, "efficiency": 95.1, "wasted": 0.6, "speed": 5.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 92, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 68, "data_gap_recovery_extended": 40, "argument_transformation": 
30, "grounded_synthesis": 60, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 62, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 72, "inconsistent_api_recovery_stateful": 86}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 46, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 34, "data_gap_recovery_extended": 20, "argument_transformation": 15, "grounded_synthesis": 30, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 31, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 43}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, 
"conditional_routing": 184, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 170, "data_gap_recovery_extended": 160, "argument_transformation": 75, "grounded_synthesis": 300, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 144, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 360, "inconsistent_api_recovery_stateful": 344}, "scenarioActualCalls": {"relevance_detection": 101, "argument_fidelity": 150, "tool_selection": 177, "basic_2step": 120, "sequential_3step": 151, "conditional_routing": 198, "sequential_reasoning": 247, "error_recovery": 150, "data_gap_recovery": 181, "data_gap_recovery_extended": 132, "argument_transformation": 74, "grounded_synthesis": 161, "inconsistent_api_recovery": 506, "relevance_detection_stateful": 101, "argument_fidelity_stateful": 150, "tool_selection_stateful": 168, "basic_2step_stateful": 101, "sequential_3step_stateful": 152, "conditional_routing_stateful": 215, "sequential_reasoning_stateful": 244, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 109, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 189, "inconsistent_api_recovery_stateful": 452}, "scenarioWastedSum": {"relevance_detection": 51.0, "argument_fidelity": 0.0, "tool_selection": 27.0, "basic_2step": 20.0, "sequential_3step": 1.0, "conditional_routing": 31.0, "sequential_reasoning": 47.0, "error_recovery": 50.0, "data_gap_recovery": 37.0, "data_gap_recovery_extended": 7.0, "argument_transformation": 12.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 149.0, "relevance_detection_stateful": 51.0, 
"argument_fidelity_stateful": 0.0, "tool_selection_stateful": 18.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 44.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 25.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 16.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 130.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 25.98, "argument_fidelity": 47.13, "tool_selection": 52.28, "basic_2step": 26.56, "sequential_3step": 55.23, "conditional_routing": 323.49, "sequential_reasoning": 189.11, "error_recovery": 29.36, "data_gap_recovery": 275.68, "data_gap_recovery_extended": 447.78, "argument_transformation": 678.47, "grounded_synthesis": 695.69, "inconsistent_api_recovery": 415.86, "relevance_detection_stateful": 26.98, "argument_fidelity_stateful": 48.23, "tool_selection_stateful": 47.11, "basic_2step_stateful": 26.06, "sequential_3step_stateful": 67.43, "conditional_routing_stateful": 309.31, "sequential_reasoning_stateful": 185.42, "error_recovery_stateful": 31.38, "data_gap_recovery_stateful": 
248.33, "data_gap_recovery_extended_stateful": 436.12, "argument_transformation_stateful": 656.33, "grounded_synthesis_stateful": 790.75, "inconsistent_api_recovery_stateful": 398.18}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-27B-Q4_K_M LS/P [reforged:full]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 83.5, "accuracy": 85.0, "completeness": 98.2, "efficiency": 97.0, "wasted": 0.4, "speed": 53.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 98, "data_gap_recovery_extended": 6, "argument_transformation": 66, "grounded_synthesis": 52, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 
96, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 56, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 80}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 3, "argument_transformation": 33, "grounded_synthesis": 26, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 40}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 245, "data_gap_recovery_extended": 24, "argument_transformation": 165, "grounded_synthesis": 260, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 140, "grounded_synthesis_stateful": 180, "inconsistent_api_recovery_stateful": 320}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 281, "sequential_reasoning": 200, "error_recovery": 151, "data_gap_recovery": 237, "data_gap_recovery_extended": 12, "argument_transformation": 143, "grounded_synthesis": 251, "inconsistent_api_recovery": 369, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 155, "conditional_routing_stateful": 277, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 145, "data_gap_recovery_stateful": 218, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 113, "grounded_synthesis_stateful": 166, "inconsistent_api_recovery_stateful": 323}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 82.0, "sequential_reasoning": 0.0, "error_recovery": 51.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 6.0, "grounded_synthesis": 66.0, "inconsistent_api_recovery": 52.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 5.0, "conditional_routing_stateful": 81.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 3.0, 
"grounded_synthesis_stateful": 57.0, "inconsistent_api_recovery_stateful": 36.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioSpeedSum": {"relevance_detection": 394.79, "argument_fidelity": 569.94, "tool_selection": 439.43, "basic_2step": 513.59, "sequential_3step": 761.2, "conditional_routing": 2251.22, "sequential_reasoning": 1014.67, "error_recovery": 1982.98, "data_gap_recovery": 2397.0, "data_gap_recovery_extended": 3303.2, "argument_transformation": 6040.23, "grounded_synthesis": 5500.19, "inconsistent_api_recovery": 9925.29, "relevance_detection_stateful": 449.83, "argument_fidelity_stateful": 580.21, "tool_selection_stateful": 449.79, "basic_2step_stateful": 1244.72, "sequential_3step_stateful": 829.13, "conditional_routing_stateful": 2522.43, "sequential_reasoning_stateful": 966.14, "error_recovery_stateful": 1790.76, "data_gap_recovery_stateful": 2280.2, "data_gap_recovery_extended_stateful": 3180.8, "argument_transformation_stateful": 5799.48, "grounded_synthesis_stateful": 5363.13, "inconsistent_api_recovery_stateful": 8271.52}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged:full]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 82.2, "accuracy": 82.2, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.3, "speed": 23.6, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 90, "sequential_reasoning": 92, "error_recovery": 98, "data_gap_recovery": 92, "data_gap_recovery_extended": 16, "argument_transformation": 46, "grounded_synthesis": 62, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 94}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 45, "sequential_reasoning": 46, "error_recovery": 49, "data_gap_recovery": 46, "data_gap_recovery_extended": 8, "argument_transformation": 23, "grounded_synthesis": 31, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 47}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 180, "sequential_reasoning": 184, "error_recovery": 98, "data_gap_recovery": 230, "data_gap_recovery_extended": 64, "argument_transformation": 115, "grounded_synthesis": 310, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 176, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 105, "grounded_synthesis_stateful": 250, 
"inconsistent_api_recovery_stateful": 376}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 189, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 174, "sequential_reasoning": 184, "error_recovery": 150, "data_gap_recovery": 186, "data_gap_recovery_extended": 32, "argument_transformation": 84, "grounded_synthesis": 160, "inconsistent_api_recovery": 442, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, "tool_selection_stateful": 176, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 187, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 166, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 81, "grounded_synthesis_stateful": 120, "inconsistent_api_recovery_stateful": 398}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 39.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 28.0, "sequential_reasoning": 0.0, "error_recovery": 54.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 78.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 26.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 40.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 56.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 421.31, "argument_fidelity": 311.17, "tool_selection": 533.83, "basic_2step": 405.51, "sequential_3step": 506.4, "conditional_routing": 647.68, "sequential_reasoning": 598.51, "error_recovery": 1819.19, "data_gap_recovery": 773.69, "data_gap_recovery_extended": 902.36, "argument_transformation": 2146.81, "grounded_synthesis": 2281.21, "inconsistent_api_recovery": 3795.79, "relevance_detection_stateful": 473.26, "argument_fidelity_stateful": 285.67, "tool_selection_stateful": 402.3, "basic_2step_stateful": 420.15, "sequential_3step_stateful": 413.66, "conditional_routing_stateful": 683.53, "sequential_reasoning_stateful": 432.88, "error_recovery_stateful": 2249.88, "data_gap_recovery_stateful": 816.62, "data_gap_recovery_extended_stateful": 1018.74, "argument_transformation_stateful": 2224.15, "grounded_synthesis_stateful": 2295.51, "inconsistent_api_recovery_stateful": 3800.19}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:full]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 81.8, "accuracy": 81.8, "completeness": 100.0, "efficiency": 95.2, "wasted": 0.5, "speed": 4.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 98, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 98, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 98, "data_gap_recovery_extended": 62, "argument_transformation": 8, "grounded_synthesis": 30, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 62, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 46}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 31, "argument_transformation": 4, "grounded_synthesis": 15, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 31, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 23}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 147, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 196, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 245, "data_gap_recovery_extended": 248, "argument_transformation": 20, "grounded_synthesis": 150, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 248, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 200, "inconsistent_api_recovery_stateful": 184}, "scenarioActualCalls": {"relevance_detection": 93, "argument_fidelity": 150, "tool_selection": 152, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 225, "sequential_reasoning": 245, "error_recovery": 150, "data_gap_recovery": 
220, "data_gap_recovery_extended": 169, "argument_transformation": 15, "grounded_synthesis": 104, "inconsistent_api_recovery": 521, "relevance_detection_stateful": 92, "argument_fidelity_stateful": 150, "tool_selection_stateful": 156, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 238, "sequential_reasoning_stateful": 241, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 223, "data_gap_recovery_extended_stateful": 162, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 128, "inconsistent_api_recovery_stateful": 276}, "scenarioWastedSum": {"relevance_detection": 43.0, "argument_fidelity": 0.0, "tool_selection": 5.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 42.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 11.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 146.0, "relevance_detection_stateful": 42.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 9.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 47.0, "sequential_reasoning_stateful": 52.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 9.0, "argument_transformation_stateful": 12.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 165.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 27.34, "argument_fidelity": 33.84, "tool_selection": 27.89, "basic_2step": 15.09, "sequential_3step": 28.69, "conditional_routing": 308.41, "sequential_reasoning": 204.36, "error_recovery": 17.84, "data_gap_recovery": 262.36, "data_gap_recovery_extended": 397.66, "argument_transformation": 488.81, "grounded_synthesis": 398.04, "inconsistent_api_recovery": 515.64, "relevance_detection_stateful": 30.03, "argument_fidelity_stateful": 35.37, "tool_selection_stateful": 32.46, "basic_2step_stateful": 17.13, "sequential_3step_stateful": 28.58, "conditional_routing_stateful": 309.31, "sequential_reasoning_stateful": 228.71, "error_recovery_stateful": 18.57, "data_gap_recovery_stateful": 260.85, "data_gap_recovery_extended_stateful": 396.16, "argument_transformation_stateful": 521.32, "grounded_synthesis_stateful": 417.82, "inconsistent_api_recovery_stateful": 537.68}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 82.4, "accuracy": 83.0, "completeness": 99.3, "efficiency": 92.2, "wasted": 0.6, "speed": 4.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 68, "argument_transformation": 16, "grounded_synthesis": 28, "inconsistent_api_recovery": 86, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 62}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 34, "argument_transformation": 8, "grounded_synthesis": 14, "inconsistent_api_recovery": 43, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 31}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 46}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 46}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 272, "argument_transformation": 40, "grounded_synthesis": 140, "inconsistent_api_recovery": 344, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 256, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 120, "inconsistent_api_recovery_stateful": 248}, "scenarioActualCalls": {"relevance_detection": 95, "argument_fidelity": 150, "tool_selection": 155, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 243, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 224, "data_gap_recovery_extended": 175, "argument_transformation": 42, "grounded_synthesis": 84, "inconsistent_api_recovery": 522, "relevance_detection_stateful": 92, "argument_fidelity_stateful": 150, "tool_selection_stateful": 159, "basic_2step_stateful": 100, "sequential_3step_stateful": 
150, "conditional_routing_stateful": 231, "sequential_reasoning_stateful": 247, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 179, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 87, "inconsistent_api_recovery_stateful": 391}, "scenarioWastedSum": {"relevance_detection": 45.0, "argument_fidelity": 0.0, "tool_selection": 5.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 19.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 186.0, "relevance_detection_stateful": 42.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 9.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 45.0, "sequential_reasoning_stateful": 52.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 5.0, "argument_transformation_stateful": 11.0, "grounded_synthesis_stateful": 9.0, "inconsistent_api_recovery_stateful": 190.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 46}, 
"scenarioSpeedSum": {"relevance_detection": 27.28, "argument_fidelity": 35.5, "tool_selection": 28.12, "basic_2step": 15.57, "sequential_3step": 28.74, "conditional_routing": 289.48, "sequential_reasoning": 233.65, "error_recovery": 18.38, "data_gap_recovery": 267.46, "data_gap_recovery_extended": 395.13, "argument_transformation": 482.39, "grounded_synthesis": 390.72, "inconsistent_api_recovery": 447.0, "relevance_detection_stateful": 29.15, "argument_fidelity_stateful": 34.27, "tool_selection_stateful": 33.99, "basic_2step_stateful": 17.6, "sequential_3step_stateful": 29.4, "conditional_routing_stateful": 283.75, "sequential_reasoning_stateful": 212.7, "error_recovery_stateful": 19.86, "data_gap_recovery_stateful": 256.93, "data_gap_recovery_extended_stateful": 392.91, "argument_transformation_stateful": 518.33, "grounded_synthesis_stateful": 434.23, "inconsistent_api_recovery_stateful": 457.32}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 46}}, {"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, 
"retired": false, "score": 80.6, "accuracy": 80.6, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.0, "speed": 3.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 46, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 52, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 
50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 23, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 230, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 260, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 158, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 146, "inconsistent_api_recovery": 218, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 147, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 230}, "scenarioWastedSum": {"relevance_detection": 0.0, 
"argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 4.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 7.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.82, "argument_fidelity": 60.26, "tool_selection": 53.11, "basic_2step": 24.61, "sequential_3step": 70.79, "conditional_routing": 187.26, "sequential_reasoning": 108.02, "error_recovery": 38.57, "data_gap_recovery": 153.2, "data_gap_recovery_extended": 182.49, "argument_transformation": 212.34, "grounded_synthesis": 502.56, 
"inconsistent_api_recovery": 311.19, "relevance_detection_stateful": 16.78, "argument_fidelity_stateful": 60.01, "tool_selection_stateful": 53.22, "basic_2step_stateful": 24.59, "sequential_3step_stateful": 70.44, "conditional_routing_stateful": 184.44, "sequential_reasoning_stateful": 112.28, "error_recovery_stateful": 38.59, "data_gap_recovery_stateful": 152.76, "data_gap_recovery_extended_stateful": 182.1, "argument_transformation_stateful": 211.19, "grounded_synthesis_stateful": 515.36, "inconsistent_api_recovery_stateful": 315.76}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 81.0, "accuracy": 81.0, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.3, "speed": 4.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, 
"data_gap_recovery_extended": 30, "argument_transformation": 0, "grounded_synthesis": 4, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 68, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 15, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, 
"tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 120, "argument_transformation": 0, "grounded_synthesis": 20, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 272, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 215, "data_gap_recovery_extended": 75, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 350, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 207, "data_gap_recovery_extended_stateful": 165, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 350}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, 
"relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 28.8, "argument_fidelity": 62.04, "tool_selection": 56.63, "basic_2step": 37.9, "sequential_3step": 38.59, "conditional_routing": 146.96, "sequential_reasoning": 149.14, "error_recovery": 206.37, "data_gap_recovery": 296.38, "data_gap_recovery_extended": 459.2, "argument_transformation": 596.6, "grounded_synthesis": 307.48, "inconsistent_api_recovery": 243.07, "relevance_detection_stateful": 28.1, "argument_fidelity_stateful": 61.19, "tool_selection_stateful": 56.67, "basic_2step_stateful": 38.13, "sequential_3step_stateful": 38.66, "conditional_routing_stateful": 174.82, "sequential_reasoning_stateful": 193.85, "error_recovery_stateful": 
208.37, "data_gap_recovery_stateful": 288.86, "data_gap_recovery_extended_stateful": 434.34, "argument_transformation_stateful": 580.67, "grounded_synthesis_stateful": 309.09, "inconsistent_api_recovery_stateful": 242.66}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 81.4, "accuracy": 81.6, "completeness": 99.7, "efficiency": 93.6, "wasted": 0.6, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 98, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 68, "argument_transformation": 24, "grounded_synthesis": 18, "inconsistent_api_recovery": 86, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 
100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 70, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 26}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 34, "argument_transformation": 12, "grounded_synthesis": 9, "inconsistent_api_recovery": 43, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 13}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 147, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 272, "argument_transformation": 60, "grounded_synthesis": 90, "inconsistent_api_recovery": 344, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 104}, "scenarioActualCalls": {"relevance_detection": 96, "argument_fidelity": 150, "tool_selection": 156, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 230, "sequential_reasoning": 251, "error_recovery": 150, "data_gap_recovery": 223, "data_gap_recovery_extended": 196, "argument_transformation": 60, "grounded_synthesis": 61, "inconsistent_api_recovery": 513, "relevance_detection_stateful": 95, "argument_fidelity_stateful": 150, "tool_selection_stateful": 153, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 248, "sequential_reasoning_stateful": 251, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 222, "data_gap_recovery_extended_stateful": 194, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 86, "inconsistent_api_recovery_stateful": 162}, "scenarioWastedSum": {"relevance_detection": 46.0, "argument_fidelity": 0.0, "tool_selection": 9.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 40.0, "sequential_reasoning": 51.0, "error_recovery": 50.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 5.0, "argument_transformation": 17.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 180.0, "relevance_detection_stateful": 45.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 6.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 52.0, "sequential_reasoning_stateful": 51.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, 
"data_gap_recovery_extended_stateful": 5.0, "argument_transformation_stateful": 9.0, "grounded_synthesis_stateful": 5.0, "inconsistent_api_recovery_stateful": 189.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 29.36, "argument_fidelity": 34.08, "tool_selection": 33.13, "basic_2step": 15.59, "sequential_3step": 28.21, "conditional_routing": 275.7, "sequential_reasoning": 209.31, "error_recovery": 18.24, "data_gap_recovery": 267.66, "data_gap_recovery_extended": 384.7, "argument_transformation": 466.97, "grounded_synthesis": 375.06, "inconsistent_api_recovery": 300.66, "relevance_detection_stateful": 30.77, "argument_fidelity_stateful": 35.56, "tool_selection_stateful": 30.8, "basic_2step_stateful": 17.61, "sequential_3step_stateful": 29.82, "conditional_routing_stateful": 265.95, "sequential_reasoning_stateful": 206.19, "error_recovery_stateful": 19.77, "data_gap_recovery_stateful": 276.76, "data_gap_recovery_extended_stateful": 411.89, "argument_transformation_stateful": 459.61, "grounded_synthesis_stateful": 367.47, "inconsistent_api_recovery_stateful": 306.51}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:full]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 81.2, "accuracy": 82.0, "completeness": 98.9, "efficiency": 97.8, "wasted": 0.6, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 74, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 68, "data_gap_recovery_extended": 40, "argument_transformation": 4, "grounded_synthesis": 72, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 82, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 78, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 64, "inconsistent_api_recovery_stateful": 92}, "scenarioRuns": {"relevance_detection": 
50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 37, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 34, "data_gap_recovery_extended": 20, "argument_transformation": 2, "grounded_synthesis": 36, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 46}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 49, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 
50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 49, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 148, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 170, "data_gap_recovery_extended": 160, "argument_transformation": 10, "grounded_synthesis": 360, "inconsistent_api_recovery": 352, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 195, "data_gap_recovery_extended_stateful": 152, 
"argument_transformation_stateful": 25, "grounded_synthesis_stateful": 320, "inconsistent_api_recovery_stateful": 368}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 151, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 174, "sequential_reasoning": 203, "error_recovery": 150, "data_gap_recovery": 141, "data_gap_recovery_extended": 80, "argument_transformation": 10, "grounded_synthesis": 247, "inconsistent_api_recovery": 532, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 153, "conditional_routing_stateful": 193, "sequential_reasoning_stateful": 205, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 169, "data_gap_recovery_extended_stateful": 89, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 237, "inconsistent_api_recovery_stateful": 560}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 1.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 32.0, "sequential_reasoning": 3.0, "error_recovery": 50.0, "data_gap_recovery": 11.0, "data_gap_recovery_extended": 13.0, "argument_transformation": 50.0, "grounded_synthesis": 26.0, "inconsistent_api_recovery": 185.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 3.0, "conditional_routing_stateful": 35.0, "sequential_reasoning_stateful": 5.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 13.0, "data_gap_recovery_extended_stateful": 13.0, "argument_transformation_stateful": 40.0, "grounded_synthesis_stateful": 34.0, "inconsistent_api_recovery_stateful": 200.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 49, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 16.82, "argument_fidelity": 61.15, "tool_selection": 50.08, "basic_2step": 30.18, "sequential_3step": 63.73, "conditional_routing": 186.9, "sequential_reasoning": 86.79, "error_recovery": 39.25, "data_gap_recovery": 236.18, "data_gap_recovery_extended": 401.56, "argument_transformation": 371.33, "grounded_synthesis": 541.19, "inconsistent_api_recovery": 297.21, "relevance_detection_stateful": 17.02, "argument_fidelity_stateful": 60.63, "tool_selection_stateful": 50.8, "basic_2step_stateful": 33.73, "sequential_3step_stateful": 67.13, "conditional_routing_stateful": 181.53, "sequential_reasoning_stateful": 90.02, "error_recovery_stateful": 39.14, "data_gap_recovery_stateful": 212.99, "data_gap_recovery_extended_stateful": 433.33, "argument_transformation_stateful": 449.66, "grounded_synthesis_stateful": 511.71, "inconsistent_api_recovery_stateful": 301.11}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 49, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:full]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 81.0, "accuracy": 83.1, "completeness": 97.5, "efficiency": 94.8, "wasted": 0.7, "speed": 3.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 98, "tool_selection": 88, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 98, "data_gap_recovery": 98, "data_gap_recovery_extended": 84, "argument_transformation": 24, "grounded_synthesis": 38, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 26}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 42, "argument_transformation": 12, "grounded_synthesis": 19, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 13}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 43, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 43, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 98, "data_gap_recovery": 245, "data_gap_recovery_extended": 336, "argument_transformation": 60, "grounded_synthesis": 190, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 288, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 104}, "scenarioActualCalls": {"relevance_detection": 67, "argument_fidelity": 162, "tool_selection": 174, "basic_2step": 100, "sequential_3step": 154, 
"conditional_routing": 221, "sequential_reasoning": 201, "error_recovery": 147, "data_gap_recovery": 234, "data_gap_recovery_extended": 268, "argument_transformation": 65, "grounded_synthesis": 160, "inconsistent_api_recovery": 386, "relevance_detection_stateful": 68, "argument_fidelity_stateful": 169, "tool_selection_stateful": 203, "basic_2step_stateful": 100, "sequential_3step_stateful": 156, "conditional_routing_stateful": 227, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 222, "data_gap_recovery_extended_stateful": 220, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 81, "inconsistent_api_recovery_stateful": 151}, "scenarioWastedSum": {"relevance_detection": 17.0, "argument_fidelity": 15.0, "tool_selection": 42.0, "basic_2step": 0.0, "sequential_3step": 4.0, "conditional_routing": 39.0, "sequential_reasoning": 1.0, "error_recovery": 50.0, "data_gap_recovery": 11.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 58.0, "grounded_synthesis": 77.0, "inconsistent_api_recovery": 156.0, "relevance_detection_stateful": 18.0, "argument_fidelity_stateful": 19.0, "tool_selection_stateful": 53.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 6.0, "conditional_routing_stateful": 43.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 3.0, "argument_transformation_stateful": 78.0, "grounded_synthesis_stateful": 37.0, "inconsistent_api_recovery_stateful": 161.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 43, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 21.78, "argument_fidelity": 46.16, "tool_selection": 55.07, "basic_2step": 19.08, "sequential_3step": 42.7, "conditional_routing": 120.26, "sequential_reasoning": 51.14, "error_recovery": 23.54, "data_gap_recovery": 171.84, "data_gap_recovery_extended": 263.78, "argument_transformation": 266.59, "grounded_synthesis": 733.85, "inconsistent_api_recovery": 234.24, "relevance_detection_stateful": 23.1, "argument_fidelity_stateful": 48.05, "tool_selection_stateful": 59.24, "basic_2step_stateful": 21.55, "sequential_3step_stateful": 44.78, "conditional_routing_stateful": 118.26, "sequential_reasoning_stateful": 49.42, "error_recovery_stateful": 23.55, "data_gap_recovery_stateful": 157.92, "data_gap_recovery_extended_stateful": 277.44, "argument_transformation_stateful": 196.4, "grounded_synthesis_stateful": 470.0, "inconsistent_api_recovery_stateful": 224.75}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 43, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:keep-last]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 81.2, "accuracy": 84.6, "completeness": 96.0, "efficiency": 97.0, "wasted": 0.7, "speed": 4.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 94, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 92, "data_gap_recovery": 98, "data_gap_recovery_extended": 72, "argument_transformation": 12, "grounded_synthesis": 42, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 86, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 52, "inconsistent_api_recovery_stateful": 32}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 46, "data_gap_recovery": 49, "data_gap_recovery_extended": 36, "argument_transformation": 6, "grounded_synthesis": 21, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 16}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 92, "data_gap_recovery": 245, "data_gap_recovery_extended": 288, "argument_transformation": 30, "grounded_synthesis": 210, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 129, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 256, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 260, "inconsistent_api_recovery_stateful": 128}, "scenarioActualCalls": {"relevance_detection": 79, "argument_fidelity": 153, "tool_selection": 186, "basic_2step": 100, "sequential_3step": 151, "conditional_routing": 234, "sequential_reasoning": 200, "error_recovery": 138, "data_gap_recovery": 223, "data_gap_recovery_extended": 208, "argument_transformation": 35, "grounded_synthesis": 144, "inconsistent_api_recovery": 419, "relevance_detection_stateful": 82, 
"argument_fidelity_stateful": 155, "tool_selection_stateful": 161, "basic_2step_stateful": 100, "sequential_3step_stateful": 152, "conditional_routing_stateful": 242, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 217, "data_gap_recovery_extended_stateful": 186, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 177}, "scenarioWastedSum": {"relevance_detection": 29.0, "argument_fidelity": 3.0, "tool_selection": 45.0, "basic_2step": 0.0, "sequential_3step": 1.0, "conditional_routing": 47.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 88.0, "grounded_synthesis": 42.0, "inconsistent_api_recovery": 133.0, "relevance_detection_stateful": 32.0, "argument_fidelity_stateful": 5.0, "tool_selection_stateful": 32.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 48.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 92.0, "grounded_synthesis_stateful": 62.0, "inconsistent_api_recovery_stateful": 136.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 38.17, "argument_fidelity": 58.48, "tool_selection": 76.74, "basic_2step": 29.17, "sequential_3step": 63.39, "conditional_routing": 179.8, "sequential_reasoning": 79.25, "error_recovery": 35.39, "data_gap_recovery": 238.49, "data_gap_recovery_extended": 438.34, "argument_transformation": 232.04, "grounded_synthesis": 765.8, "inconsistent_api_recovery": 326.58, "relevance_detection_stateful": 37.51, "argument_fidelity_stateful": 60.81, "tool_selection_stateful": 60.94, "basic_2step_stateful": 33.05, "sequential_3step_stateful": 63.23, "conditional_routing_stateful": 172.18, "sequential_reasoning_stateful": 79.0, "error_recovery_stateful": 35.61, "data_gap_recovery_stateful": 217.06, "data_gap_recovery_extended_stateful": 429.67, "argument_transformation_stateful": 254.45, "grounded_synthesis_stateful": 743.21, "inconsistent_api_recovery_stateful": 329.66}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:full]", "model": 
"Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 81.4, "accuracy": 84.8, "completeness": 96.0, "efficiency": 95.4, "wasted": 0.7, "speed": 4.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 88, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 96, "data_gap_recovery": 100, "data_gap_recovery_extended": 72, "argument_transformation": 12, "grounded_synthesis": 58, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 88, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 92, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 40}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 
50, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 48, "data_gap_recovery": 50, "data_gap_recovery_extended": 36, "argument_transformation": 6, "grounded_synthesis": 29, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 46, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 20}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 36, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 36, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 96, "data_gap_recovery": 250, "data_gap_recovery_extended": 288, "argument_transformation": 30, "grounded_synthesis": 290, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 132, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 138, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 256, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 180, "inconsistent_api_recovery_stateful": 160}, "scenarioActualCalls": {"relevance_detection": 82, "argument_fidelity": 159, "tool_selection": 153, "basic_2step": 100, "sequential_3step": 152, "conditional_routing": 236, "sequential_reasoning": 201, "error_recovery": 144, "data_gap_recovery": 217, "data_gap_recovery_extended": 222, "argument_transformation": 32, "grounded_synthesis": 239, "inconsistent_api_recovery": 445, "relevance_detection_stateful": 76, "argument_fidelity_stateful": 154, "tool_selection_stateful": 165, "basic_2step_stateful": 100, "sequential_3step_stateful": 151, "conditional_routing_stateful": 236, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 138, "data_gap_recovery_stateful": 228, 
"data_gap_recovery_extended_stateful": 196, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 136, "inconsistent_api_recovery_stateful": 236}, "scenarioWastedSum": {"relevance_detection": 32.0, "argument_fidelity": 9.0, "tool_selection": 21.0, "basic_2step": 0.0, "sequential_3step": 2.0, "conditional_routing": 49.0, "sequential_reasoning": 1.0, "error_recovery": 50.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 83.0, "grounded_synthesis": 58.0, "inconsistent_api_recovery": 155.0, "relevance_detection_stateful": 26.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 33.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 46.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 81.0, "grounded_synthesis_stateful": 28.0, "inconsistent_api_recovery_stateful": 168.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 36, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 39.76, "argument_fidelity": 63.93, "tool_selection": 60.49, "basic_2step": 30.06, 
"sequential_3step": 64.48, "conditional_routing": 187.49, "sequential_reasoning": 83.25, "error_recovery": 36.47, "data_gap_recovery": 208.49, "data_gap_recovery_extended": 446.2, "argument_transformation": 246.56, "grounded_synthesis": 600.81, "inconsistent_api_recovery": 334.35, "relevance_detection_stateful": 37.06, "argument_fidelity_stateful": 63.28, "tool_selection_stateful": 64.73, "basic_2step_stateful": 33.96, "sequential_3step_stateful": 63.21, "conditional_routing_stateful": 173.54, "sequential_reasoning_stateful": 81.71, "error_recovery_stateful": 36.52, "data_gap_recovery_stateful": 228.69, "data_gap_recovery_extended_stateful": 448.95, "argument_transformation_stateful": 284.39, "grounded_synthesis_stateful": 792.01, "inconsistent_api_recovery_stateful": 336.98}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 46, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 36, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 80.9, "accuracy": 84.8, "completeness": 95.4, "efficiency": 95.2, "wasted": 0.7, "speed": 3.9, "n": 50, 
"scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 90, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 98, "data_gap_recovery": 100, "data_gap_recovery_extended": 72, "argument_transformation": 6, "grounded_synthesis": 50, "inconsistent_api_recovery": 76, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 86, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 92, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 32}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 36, "argument_transformation": 3, "grounded_synthesis": 25, 
"inconsistent_api_recovery": 38, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 46, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 16}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 135, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 98, "data_gap_recovery": 250, "data_gap_recovery_extended": 288, "argument_transformation": 15, "grounded_synthesis": 250, "inconsistent_api_recovery": 304, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 129, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 138, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 256, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 128}, "scenarioActualCalls": {"relevance_detection": 77, "argument_fidelity": 156, "tool_selection": 181, "basic_2step": 100, "sequential_3step": 151, "conditional_routing": 237, "sequential_reasoning": 200, "error_recovery": 147, "data_gap_recovery": 222, "data_gap_recovery_extended": 218, "argument_transformation": 27, "grounded_synthesis": 182, "inconsistent_api_recovery": 419, "relevance_detection_stateful": 78, "argument_fidelity_stateful": 156, "tool_selection_stateful": 169, "basic_2step_stateful": 100, "sequential_3step_stateful": 151, "conditional_routing_stateful": 246, "sequential_reasoning_stateful": 201, "error_recovery_stateful": 138, "data_gap_recovery_stateful": 217, "data_gap_recovery_extended_stateful": 203, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 177, "inconsistent_api_recovery_stateful": 186}, "scenarioWastedSum": {"relevance_detection": 27.0, "argument_fidelity": 6.0, "tool_selection": 46.0, "basic_2step": 0.0, "sequential_3step": 
1.0, "conditional_routing": 45.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 72.0, "grounded_synthesis": 39.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 28.0, "argument_fidelity_stateful": 6.0, "tool_selection_stateful": 40.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 49.0, "sequential_reasoning_stateful": 1.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 52.0, "grounded_synthesis_stateful": 43.0, "inconsistent_api_recovery_stateful": 144.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 36.77, "argument_fidelity": 61.69, "tool_selection": 70.55, "basic_2step": 29.05, "sequential_3step": 63.86, "conditional_routing": 168.47, "sequential_reasoning": 79.86, "error_recovery": 35.56, "data_gap_recovery": 200.64, "data_gap_recovery_extended": 423.81, "argument_transformation": 204.05, "grounded_synthesis": 620.01, "inconsistent_api_recovery": 332.75, "relevance_detection_stateful": 35.61, 
"argument_fidelity_stateful": 61.71, "tool_selection_stateful": 60.54, "basic_2step_stateful": 32.55, "sequential_3step_stateful": 66.83, "conditional_routing_stateful": 175.3, "sequential_reasoning_stateful": 79.8, "error_recovery_stateful": 35.48, "data_gap_recovery_stateful": 220.09, "data_gap_recovery_extended_stateful": 409.22, "argument_transformation_stateful": 248.32, "grounded_synthesis_stateful": 734.14, "inconsistent_api_recovery_stateful": 343.48}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 45, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 35, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-sonnet-4-6 AN/N [bare+any]", "model": "claude-sonnet-4-6", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 81.2, "accuracy": 87.9, "completeness": 92.3, "efficiency": 100.0, "wasted": 0.0, "speed": 9.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 100, "argument_transformation": 6, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, 
"relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 100, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 3, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 2, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, 
"data_gap_recovery": 250, "data_gap_recovery_extended": 400, "argument_transformation": 15, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 400, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 194, "argument_transformation": 12, "grounded_synthesis": 250, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 190, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 250, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, 
"sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 147.25, "argument_fidelity": 244.2, "tool_selection": 284.04, "basic_2step": 159.06, "sequential_3step": 236.73, "conditional_routing": 519.06, "sequential_reasoning": 394.52, "error_recovery": 0.0, "data_gap_recovery": 548.15, "data_gap_recovery_extended": 745.2, "argument_transformation": 558.15, "grounded_synthesis": 1386.56, "inconsistent_api_recovery": 622.13, "relevance_detection_stateful": 127.8, "argument_fidelity_stateful": 264.83, "tool_selection_stateful": 307.0, "basic_2step_stateful": 180.52, "sequential_3step_stateful": 234.71, "conditional_routing_stateful": 521.72, "sequential_reasoning_stateful": 488.53, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 568.68, "data_gap_recovery_extended_stateful": 732.33, "argument_transformation_stateful": 551.39, 
"grounded_synthesis_stateful": 1297.34, "inconsistent_api_recovery_stateful": 685.01}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [reforged]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 79.7, "accuracy": 79.9, "completeness": 99.8, "efficiency": 100.0, "wasted": 0.3, "speed": 8.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 84, "data_gap_recovery_extended": 8, "argument_transformation": 30, "grounded_synthesis": 64, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 86, "data_gap_recovery_extended_stateful": 2, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 84}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 42, "data_gap_recovery_extended": 4, "argument_transformation": 15, "grounded_synthesis": 32, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 42}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 210, "data_gap_recovery_extended": 32, "argument_transformation": 75, "grounded_synthesis": 320, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 176, 
"sequential_reasoning_stateful": 188, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 240, "inconsistent_api_recovery_stateful": 336}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 232, "sequential_reasoning": 196, "error_recovery": 155, "data_gap_recovery": 205, "data_gap_recovery_extended": 35, "argument_transformation": 60, "grounded_synthesis": 332, "inconsistent_api_recovery": 331, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 218, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 152, "data_gap_recovery_stateful": 199, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 180, "inconsistent_api_recovery_stateful": 298}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 55.0, "data_gap_recovery": 16.0, "data_gap_recovery_extended": 5.0, "argument_transformation": 0.0, "grounded_synthesis": 131.0, "inconsistent_api_recovery": 15.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 42.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 11.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 68.0, "inconsistent_api_recovery_stateful": 17.0}, "scenarioWastedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 90.61, "argument_fidelity": 91.16, "tool_selection": 90.76, "basic_2step": 49.52, "sequential_3step": 94.37, "conditional_routing": 352.94, "sequential_reasoning": 184.38, "error_recovery": 143.82, "data_gap_recovery": 499.44, "data_gap_recovery_extended": 603.23, "argument_transformation": 1123.39, "grounded_synthesis": 766.53, "inconsistent_api_recovery": 1307.87, "relevance_detection_stateful": 89.83, "argument_fidelity_stateful": 94.34, "tool_selection_stateful": 93.81, "basic_2step_stateful": 46.88, "sequential_3step_stateful": 102.14, "conditional_routing_stateful": 316.81, "sequential_reasoning_stateful": 173.08, "error_recovery_stateful": 145.06, "data_gap_recovery_stateful": 473.16, "data_gap_recovery_extended_stateful": 514.62, "argument_transformation_stateful": 1038.82, "grounded_synthesis_stateful": 752.35, "inconsistent_api_recovery_stateful": 1277.64}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 
50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 79.8, "accuracy": 82.6, "completeness": 96.7, "efficiency": 94.6, "wasted": 0.6, "speed": 2.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 96, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 98, "error_recovery": 94, "data_gap_recovery": 98, "data_gap_recovery_extended": 66, "argument_transformation": 8, "grounded_synthesis": 36, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 96, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 76, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 24}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 47, "data_gap_recovery": 49, "data_gap_recovery_extended": 33, "argument_transformation": 4, "grounded_synthesis": 18, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 12}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 196, "error_recovery": 94, "data_gap_recovery": 245, "data_gap_recovery_extended": 264, "argument_transformation": 20, "grounded_synthesis": 180, "inconsistent_api_recovery": 256, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 144, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 304, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 96}, "scenarioActualCalls": {"relevance_detection": 69, 
"argument_fidelity": 162, "tool_selection": 190, "basic_2step": 100, "sequential_3step": 153, "conditional_routing": 239, "sequential_reasoning": 196, "error_recovery": 141, "data_gap_recovery": 232, "data_gap_recovery_extended": 213, "argument_transformation": 18, "grounded_synthesis": 135, "inconsistent_api_recovery": 357, "relevance_detection_stateful": 71, "argument_fidelity_stateful": 166, "tool_selection_stateful": 191, "basic_2step_stateful": 100, "sequential_3step_stateful": 155, "conditional_routing_stateful": 241, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 216, "data_gap_recovery_extended_stateful": 241, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 156, "inconsistent_api_recovery_stateful": 131}, "scenarioWastedSum": {"relevance_detection": 19.0, "argument_fidelity": 12.0, "tool_selection": 46.0, "basic_2step": 0.0, "sequential_3step": 3.0, "conditional_routing": 47.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 9.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 13.0, "grounded_synthesis": 76.0, "inconsistent_api_recovery": 162.0, "relevance_detection_stateful": 21.0, "argument_fidelity_stateful": 16.0, "tool_selection_stateful": 47.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 5.0, "conditional_routing_stateful": 46.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 9.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 13.0, "grounded_synthesis_stateful": 41.0, "inconsistent_api_recovery_stateful": 138.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 23.15, "argument_fidelity": 47.24, "tool_selection": 59.44, "basic_2step": 19.6, "sequential_3step": 42.97, "conditional_routing": 119.97, "sequential_reasoning": 51.84, "error_recovery": 24.06, "data_gap_recovery": 174.37, "data_gap_recovery_extended": 278.95, "argument_transformation": 201.91, "grounded_synthesis": 482.32, "inconsistent_api_recovery": 243.47, "relevance_detection_stateful": 24.4, "argument_fidelity_stateful": 47.27, "tool_selection_stateful": 60.62, "basic_2step_stateful": 22.05, "sequential_3step_stateful": 47.27, "conditional_routing_stateful": 122.23, "sequential_reasoning_stateful": 50.41, "error_recovery_stateful": 24.23, "data_gap_recovery_stateful": 172.91, "data_gap_recovery_extended_stateful": 276.52, "argument_transformation_stateful": 165.58, "grounded_synthesis_stateful": 458.16, "inconsistent_api_recovery_stateful": 237.28}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:14b-q4_K_M OL/N [reforged:full]", "model": "qwen3:14b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 78.6, "accuracy": 78.7, "completeness": 99.9, "efficiency": 76.7, "wasted": 1.2, "speed": 38.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 74, "data_gap_recovery_extended": 4, "argument_transformation": 12, "grounded_synthesis": 68, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 94, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 54, "inconsistent_api_recovery_stateful": 68}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 37, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 34, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 34}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 185, "data_gap_recovery_extended": 16, "argument_transformation": 30, "grounded_synthesis": 340, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 270, "inconsistent_api_recovery_stateful": 272}, "scenarioActualCalls": {"relevance_detection": 60, "argument_fidelity": 155, "tool_selection": 198, "basic_2step": 203, "sequential_3step": 153, "conditional_routing": 268, "sequential_reasoning": 200, "error_recovery": 158, "data_gap_recovery": 182, "data_gap_recovery_extended": 14, "argument_transformation": 24, "grounded_synthesis": 560, "inconsistent_api_recovery": 464, "relevance_detection_stateful": 57, "argument_fidelity_stateful": 
152, "tool_selection_stateful": 200, "basic_2step_stateful": 153, "sequential_3step_stateful": 154, "conditional_routing_stateful": 270, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 152, "data_gap_recovery_stateful": 219, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 460, "inconsistent_api_recovery_stateful": 415}, "scenarioWastedSum": {"relevance_detection": 10.0, "argument_fidelity": 5.0, "tool_selection": 48.0, "basic_2step": 103.0, "sequential_3step": 3.0, "conditional_routing": 68.0, "sequential_reasoning": 0.0, "error_recovery": 58.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 317.0, "inconsistent_api_recovery": 193.0, "relevance_detection_stateful": 7.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 53.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 70.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 14.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 312.0, "inconsistent_api_recovery_stateful": 189.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 132.08, "argument_fidelity": 415.71, "tool_selection": 591.32, "basic_2step": 944.65, "sequential_3step": 806.26, "conditional_routing": 1491.25, "sequential_reasoning": 783.04, "error_recovery": 650.24, "data_gap_recovery": 1061.72, "data_gap_recovery_extended": 1733.48, "argument_transformation": 3533.73, "grounded_synthesis": 7452.35, "inconsistent_api_recovery": 5646.86, "relevance_detection_stateful": 120.01, "argument_fidelity_stateful": 406.89, "tool_selection_stateful": 647.89, "basic_2step_stateful": 634.11, "sequential_3step_stateful": 821.99, "conditional_routing_stateful": 1621.36, "sequential_reasoning_stateful": 874.04, "error_recovery_stateful": 663.65, "data_gap_recovery_stateful": 1232.61, "data_gap_recovery_extended_stateful": 1850.08, "argument_transformation_stateful": 3127.15, "grounded_synthesis_stateful": 7415.91, "inconsistent_api_recovery_stateful": 5416.09}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [reforged:keep-last]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": 
"llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 78.7, "accuracy": 81.4, "completeness": 96.7, "efficiency": 99.2, "wasted": 0.5, "speed": 9.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 92, "error_recovery": 96, "data_gap_recovery": 96, "data_gap_recovery_extended": 6, "argument_transformation": 20, "grounded_synthesis": 64, "inconsistent_api_recovery": 62, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 88, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 82, "inconsistent_api_recovery_stateful": 56}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 46, "error_recovery": 48, "data_gap_recovery": 48, "data_gap_recovery_extended": 3, "argument_transformation": 10, "grounded_synthesis": 32, "inconsistent_api_recovery": 31, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 28}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 30}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 30}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 184, "error_recovery": 96, "data_gap_recovery": 240, "data_gap_recovery_extended": 24, "argument_transformation": 50, "grounded_synthesis": 320, "inconsistent_api_recovery": 248, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 176, "sequential_reasoning_stateful": 176, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 410, "inconsistent_api_recovery_stateful": 224}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 159, "tool_selection": 174, "basic_2step": 102, "sequential_3step": 158, "conditional_routing": 201, "sequential_reasoning": 194, "error_recovery": 188, "data_gap_recovery": 221, "data_gap_recovery_extended": 14, "argument_transformation": 35, "grounded_synthesis": 135, "inconsistent_api_recovery": 353, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 158, "tool_selection_stateful": 174, "basic_2step_stateful": 101, "sequential_3step_stateful": 159, "conditional_routing_stateful": 195, "sequential_reasoning_stateful": 183, "error_recovery_stateful": 208, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 8, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 181, "inconsistent_api_recovery_stateful": 356}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 9.0, "tool_selection": 24.0, "basic_2step": 2.0, "sequential_3step": 8.0, "conditional_routing": 26.0, "sequential_reasoning": 10.0, "error_recovery": 94.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 1.0, "grounded_synthesis": 16.0, "inconsistent_api_recovery": 112.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 8.0, "tool_selection_stateful": 24.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 9.0, "conditional_routing_stateful": 27.0, "sequential_reasoning_stateful": 9.0, "error_recovery_stateful": 58.0, "data_gap_recovery_stateful": 9.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 18.0, "inconsistent_api_recovery_stateful": 138.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 30}, "scenarioSpeedSum": {"relevance_detection": 87.42, "argument_fidelity": 136.2, "tool_selection": 179.73, "basic_2step": 74.62, "sequential_3step": 164.81, "conditional_routing": 470.36, 
"sequential_reasoning": 298.75, "error_recovery": 263.25, "data_gap_recovery": 692.12, "data_gap_recovery_extended": 648.57, "argument_transformation": 860.37, "grounded_synthesis": 792.14, "inconsistent_api_recovery": 1098.67, "relevance_detection_stateful": 89.05, "argument_fidelity_stateful": 159.47, "tool_selection_stateful": 168.67, "basic_2step_stateful": 79.31, "sequential_3step_stateful": 197.14, "conditional_routing_stateful": 492.99, "sequential_reasoning_stateful": 322.12, "error_recovery_stateful": 281.08, "data_gap_recovery_stateful": 663.22, "data_gap_recovery_extended_stateful": 717.96, "argument_transformation_stateful": 889.27, "grounded_synthesis_stateful": 821.27, "inconsistent_api_recovery_stateful": 1029.3}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 30}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 79.5, "accuracy": 81.9, "completeness": 97.0, "efficiency": 93.6, "wasted": 0.7, "speed": 2.8, "n": 50, "scenarios": {"relevance_detection": 100, 
"argument_fidelity": 100, "tool_selection": 98, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 90, "data_gap_recovery": 96, "data_gap_recovery_extended": 68, "argument_transformation": 4, "grounded_synthesis": 32, "inconsistent_api_recovery": 68, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 94, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 70, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 30}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 45, "data_gap_recovery": 48, "data_gap_recovery_extended": 34, "argument_transformation": 2, "grounded_synthesis": 16, "inconsistent_api_recovery": 34, "relevance_detection_stateful": 
50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 15}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 
30, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 147, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 90, "data_gap_recovery": 240, "data_gap_recovery_extended": 272, "argument_transformation": 10, "grounded_synthesis": 160, "inconsistent_api_recovery": 272, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 120}, "scenarioActualCalls": {"relevance_detection": 72, "argument_fidelity": 174, "tool_selection": 190, "basic_2step": 100, "sequential_3step": 161, "conditional_routing": 217, "sequential_reasoning": 200, "error_recovery": 135, "data_gap_recovery": 231, "data_gap_recovery_extended": 212, "argument_transformation": 8, "grounded_synthesis": 126, "inconsistent_api_recovery": 401, "relevance_detection_stateful": 66, "argument_fidelity_stateful": 163, "tool_selection_stateful": 193, "basic_2step_stateful": 100, "sequential_3step_stateful": 158, "conditional_routing_stateful": 225, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 222, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 139, "inconsistent_api_recovery_stateful": 185}, "scenarioWastedSum": {"relevance_detection": 22.0, "argument_fidelity": 24.0, "tool_selection": 43.0, "basic_2step": 0.0, "sequential_3step": 11.0, "conditional_routing": 35.0, "sequential_reasoning": 0.0, 
"error_recovery": 50.0, "data_gap_recovery": 11.0, "data_gap_recovery_extended": 5.0, "argument_transformation": 33.0, "grounded_synthesis": 80.0, "inconsistent_api_recovery": 165.0, "relevance_detection_stateful": 16.0, "argument_fidelity_stateful": 16.0, "tool_selection_stateful": 46.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 8.0, "conditional_routing_stateful": 41.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 7.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 58.0, "grounded_synthesis_stateful": 53.0, "inconsistent_api_recovery_stateful": 171.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.07, "argument_fidelity": 52.05, "tool_selection": 57.5, "basic_2step": 19.6, "sequential_3step": 48.88, "conditional_routing": 111.16, "sequential_reasoning": 51.14, "error_recovery": 24.32, "data_gap_recovery": 163.97, "data_gap_recovery_extended": 287.52, "argument_transformation": 167.54, "grounded_synthesis": 457.57, "inconsistent_api_recovery": 234.05, "relevance_detection_stateful": 22.32, "argument_fidelity_stateful": 46.12, "tool_selection_stateful": 
60.64, "basic_2step_stateful": 22.07, "sequential_3step_stateful": 48.84, "conditional_routing_stateful": 122.08, "sequential_reasoning_stateful": 51.35, "error_recovery_stateful": 24.13, "data_gap_recovery_stateful": 170.98, "data_gap_recovery_extended_stateful": 284.75, "argument_transformation_stateful": 191.64, "grounded_synthesis_stateful": 514.6, "inconsistent_api_recovery_stateful": 229.11}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [reforged:full]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 79.2, "accuracy": 82.2, "completeness": 96.3, "efficiency": 99.1, "wasted": 0.5, "speed": 10.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 88, "error_recovery": 94, "data_gap_recovery": 100, "data_gap_recovery_extended": 2, "argument_transformation": 40, "grounded_synthesis": 78, "inconsistent_api_recovery": 54, "relevance_detection_stateful": 100, 
"argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 84, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 82, "inconsistent_api_recovery_stateful": 52}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 44, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 1, "argument_transformation": 20, "grounded_synthesis": 39, "inconsistent_api_recovery": 27, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 42, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 26}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 32}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 32}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 176, "error_recovery": 94, "data_gap_recovery": 250, 
"data_gap_recovery_extended": 8, "argument_transformation": 100, "grounded_synthesis": 390, "inconsistent_api_recovery": 216, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 168, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 410, "inconsistent_api_recovery_stateful": 208}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 155, "tool_selection": 174, "basic_2step": 100, "sequential_3step": 155, "conditional_routing": 214, "sequential_reasoning": 188, "error_recovery": 180, "data_gap_recovery": 233, "data_gap_recovery_extended": 6, "argument_transformation": 70, "grounded_synthesis": 230, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 153, "tool_selection_stateful": 173, "basic_2step_stateful": 100, "sequential_3step_stateful": 158, "conditional_routing_stateful": 206, "sequential_reasoning_stateful": 179, "error_recovery_stateful": 201, "data_gap_recovery_stateful": 224, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 211, "inconsistent_api_recovery_stateful": 301}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 5.0, "tool_selection": 24.0, "basic_2step": 0.0, "sequential_3step": 5.0, "conditional_routing": 32.0, "sequential_reasoning": 15.0, "error_recovery": 86.0, "data_gap_recovery": 13.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 0.0, "grounded_synthesis": 39.0, "inconsistent_api_recovery": 124.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.0, "tool_selection_stateful": 23.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 8.0, 
"conditional_routing_stateful": 30.0, "sequential_reasoning_stateful": 14.0, "error_recovery_stateful": 54.0, "data_gap_recovery_stateful": 15.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 3.0, "grounded_synthesis_stateful": 27.0, "inconsistent_api_recovery_stateful": 105.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 32}, "scenarioSpeedSum": {"relevance_detection": 90.94, "argument_fidelity": 147.45, "tool_selection": 188.67, "basic_2step": 73.15, "sequential_3step": 174.84, "conditional_routing": 513.06, "sequential_reasoning": 321.46, "error_recovery": 262.78, "data_gap_recovery": 749.47, "data_gap_recovery_extended": 747.67, "argument_transformation": 954.71, "grounded_synthesis": 866.25, "inconsistent_api_recovery": 1191.81, "relevance_detection_stateful": 89.62, "argument_fidelity_stateful": 162.46, "tool_selection_stateful": 192.49, "basic_2step_stateful": 78.7, "sequential_3step_stateful": 174.06, "conditional_routing_stateful": 498.08, "sequential_reasoning_stateful": 317.09, "error_recovery_stateful": 279.75, "data_gap_recovery_stateful": 750.51, "data_gap_recovery_extended_stateful": 716.52, "argument_transformation_stateful": 1008.37, 
"grounded_synthesis_stateful": 850.16, "inconsistent_api_recovery_stateful": 1134.22}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 45, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 32}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [reforged]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 77.8, "accuracy": 77.9, "completeness": 99.8, "efficiency": 100.0, "wasted": 0.2, "speed": 10.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 76, "sequential_reasoning": 90, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 18, "grounded_synthesis": 38, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 94}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 38, "sequential_reasoning": 45, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 9, "grounded_synthesis": 19, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 47}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 152, "sequential_reasoning": 180, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 190, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 152, 
"sequential_reasoning_stateful": 196, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 180, "inconsistent_api_recovery_stateful": 376}, "scenarioActualCalls": {"relevance_detection": 52, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 181, "sequential_reasoning": 180, "error_recovery": 155, "data_gap_recovery": 226, "data_gap_recovery_extended": 0, "argument_transformation": 37, "grounded_synthesis": 147, "inconsistent_api_recovery": 343, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 185, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 154, "data_gap_recovery_stateful": 219, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 137, "inconsistent_api_recovery_stateful": 306}, "scenarioWastedSum": {"relevance_detection": 2.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 29.0, "sequential_reasoning": 0.0, "error_recovery": 55.0, "data_gap_recovery": 6.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 0.0, "grounded_synthesis": 40.0, "inconsistent_api_recovery": 7.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 33.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 40.0, "inconsistent_api_recovery_stateful": 5.0}, "scenarioWastedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 117.04, "argument_fidelity": 112.0, "tool_selection": 109.77, "basic_2step": 56.75, "sequential_3step": 131.64, "conditional_routing": 505.61, "sequential_reasoning": 220.15, "error_recovery": 195.75, "data_gap_recovery": 682.6, "data_gap_recovery_extended": 842.34, "argument_transformation": 1276.87, "grounded_synthesis": 1021.03, "inconsistent_api_recovery": 1725.46, "relevance_detection_stateful": 114.31, "argument_fidelity_stateful": 118.53, "tool_selection_stateful": 121.32, "basic_2step_stateful": 64.85, "sequential_3step_stateful": 134.61, "conditional_routing_stateful": 507.45, "sequential_reasoning_stateful": 223.97, "error_recovery_stateful": 190.33, "data_gap_recovery_stateful": 659.36, "data_gap_recovery_extended_stateful": 811.7, "argument_transformation_stateful": 1377.98, "grounded_synthesis_stateful": 1024.31, "inconsistent_api_recovery_stateful": 1650.22}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 77.8, "accuracy": 77.8, "completeness": 100.0, "efficiency": 96.9, "wasted": 0.3, "speed": 4.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": 
{"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 200, "data_gap_recovery_extended": 20, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 350, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 350}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 29.46, "argument_fidelity": 52.23, "tool_selection": 48.61, "basic_2step": 23.58, "sequential_3step": 37.06, "conditional_routing": 159.75, "sequential_reasoning": 189.96, "error_recovery": 25.07, "data_gap_recovery": 304.52, "data_gap_recovery_extended": 335.22, "argument_transformation": 380.57, "grounded_synthesis": 770.98, "inconsistent_api_recovery": 231.07, "relevance_detection_stateful": 28.83, "argument_fidelity_stateful": 52.72, "tool_selection_stateful": 48.21, "basic_2step_stateful": 23.61, "sequential_3step_stateful": 36.75, "conditional_routing_stateful": 165.18, "sequential_reasoning_stateful": 181.67, "error_recovery_stateful": 25.07, "data_gap_recovery_stateful": 319.04, "data_gap_recovery_extended_stateful": 353.38, "argument_transformation_stateful": 376.13, "grounded_synthesis_stateful": 769.19, "inconsistent_api_recovery_stateful": 230.76}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 78.3, "accuracy": 78.3, "completeness": 100.0, "efficiency": 95.0, "wasted": 0.4, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 18, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 9, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 72, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 244, "data_gap_recovery_extended": 45, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 350, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 350}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 8.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 8.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.76, "argument_fidelity": 35.5, "tool_selection": 36.14, "basic_2step": 19.55, "sequential_3step": 61.67, "conditional_routing": 100.9, "sequential_reasoning": 91.78, "error_recovery": 186.84, "data_gap_recovery": 202.99, "data_gap_recovery_extended": 281.15, "argument_transformation": 437.08, "grounded_synthesis": 440.75, "inconsistent_api_recovery": 155.44, "relevance_detection_stateful": 16.8, "argument_fidelity_stateful": 35.61, "tool_selection_stateful": 36.43, "basic_2step_stateful": 19.56, "sequential_3step_stateful": 64.37, "conditional_routing_stateful": 102.51, "sequential_reasoning_stateful": 103.01, "error_recovery_stateful": 185.29, "data_gap_recovery_stateful": 197.54, "data_gap_recovery_extended_stateful": 285.46, "argument_transformation_stateful": 446.61, "grounded_synthesis_stateful": 449.67, "inconsistent_api_recovery_stateful": 155.24}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P 
[reforged]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 77.7, "accuracy": 78.8, "completeness": 98.5, "efficiency": 94.6, "wasted": 0.6, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 74, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 78, "data_gap_recovery_extended": 32, "argument_transformation": 2, "grounded_synthesis": 46, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 66, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 74, "data_gap_recovery_extended_stateful": 28, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 78}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 37, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 39, "data_gap_recovery_extended": 16, "argument_transformation": 1, "grounded_synthesis": 23, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 33, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 39}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 44}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 46, 
"inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 44}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 148, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 195, "data_gap_recovery_extended": 128, "argument_transformation": 5, "grounded_synthesis": 230, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 132, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 112, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 240, "inconsistent_api_recovery_stateful": 312}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 152, "basic_2step": 100, "sequential_3step": 154, "conditional_routing": 170, "sequential_reasoning": 204, "error_recovery": 150, "data_gap_recovery": 169, "data_gap_recovery_extended": 73, "argument_transformation": 4, "grounded_synthesis": 202, "inconsistent_api_recovery": 556, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 153, "conditional_routing_stateful": 155, "sequential_reasoning_stateful": 203, "error_recovery_stateful": 150, 
"data_gap_recovery_stateful": 152, "data_gap_recovery_extended_stateful": 65, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 188, "inconsistent_api_recovery_stateful": 469}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 2.0, "basic_2step": 0.0, "sequential_3step": 4.0, "conditional_routing": 28.0, "sequential_reasoning": 4.0, "error_recovery": 50.0, "data_gap_recovery": 19.0, "data_gap_recovery_extended": 9.0, "argument_transformation": 49.0, "grounded_synthesis": 46.0, "inconsistent_api_recovery": 204.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 3.0, "conditional_routing_stateful": 27.0, "sequential_reasoning_stateful": 3.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.0, "data_gap_recovery_extended_stateful": 14.0, "argument_transformation_stateful": 22.0, "grounded_synthesis_stateful": 37.0, "inconsistent_api_recovery_stateful": 186.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 44}, "scenarioSpeedSum": {"relevance_detection": 17.01, "argument_fidelity": 59.87, "tool_selection": 50.95, 
"basic_2step": 29.9, "sequential_3step": 89.75, "conditional_routing": 182.8, "sequential_reasoning": 95.82, "error_recovery": 38.98, "data_gap_recovery": 222.16, "data_gap_recovery_extended": 424.5, "argument_transformation": 417.8, "grounded_synthesis": 487.0, "inconsistent_api_recovery": 311.68, "relevance_detection_stateful": 16.98, "argument_fidelity_stateful": 61.55, "tool_selection_stateful": 50.68, "basic_2step_stateful": 33.91, "sequential_3step_stateful": 70.83, "conditional_routing_stateful": 184.17, "sequential_reasoning_stateful": 90.75, "error_recovery_stateful": 38.96, "data_gap_recovery_stateful": 215.45, "data_gap_recovery_extended_stateful": 421.32, "argument_transformation_stateful": 442.29, "grounded_synthesis_stateful": 519.98, "inconsistent_api_recovery_stateful": 293.12}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 44}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 78.0, "accuracy": 78.8, "completeness": 98.9, "efficiency": 94.7, 
"wasted": 0.6, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 82, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 70, "data_gap_recovery_extended": 28, "argument_transformation": 2, "grounded_synthesis": 52, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 74, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 92}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 41, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 35, "data_gap_recovery_extended": 14, "argument_transformation": 1, 
"grounded_synthesis": 26, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 46}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 164, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 175, "data_gap_recovery_extended": 112, "argument_transformation": 5, "grounded_synthesis": 260, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 148, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 240, "inconsistent_api_recovery_stateful": 368}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 152, "conditional_routing": 193, "sequential_reasoning": 205, "error_recovery": 150, "data_gap_recovery": 164, "data_gap_recovery_extended": 64, "argument_transformation": 7, "grounded_synthesis": 161, "inconsistent_api_recovery": 473, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 152, "conditional_routing_stateful": 177, "sequential_reasoning_stateful": 207, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 169, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 205, "inconsistent_api_recovery_stateful": 554}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, 
"basic_2step": 0.0, "sequential_3step": 2.0, "conditional_routing": 35.0, "sequential_reasoning": 5.0, "error_recovery": 50.0, "data_gap_recovery": 15.0, "data_gap_recovery_extended": 18.0, "argument_transformation": 51.0, "grounded_synthesis": 28.0, "inconsistent_api_recovery": 179.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 33.0, "sequential_reasoning_stateful": 7.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 13.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 31.0, "grounded_synthesis_stateful": 58.0, "inconsistent_api_recovery_stateful": 196.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 16.97, "argument_fidelity": 58.2, "tool_selection": 48.8, "basic_2step": 29.68, "sequential_3step": 71.02, "conditional_routing": 193.5, "sequential_reasoning": 89.11, "error_recovery": 38.27, "data_gap_recovery": 218.52, "data_gap_recovery_extended": 471.36, "argument_transformation": 470.55, "grounded_synthesis": 489.81, "inconsistent_api_recovery": 287.27, 
"relevance_detection_stateful": 16.56, "argument_fidelity_stateful": 59.68, "tool_selection_stateful": 48.4, "basic_2step_stateful": 33.56, "sequential_3step_stateful": 164.77, "conditional_routing_stateful": 181.08, "sequential_reasoning_stateful": 90.66, "error_recovery_stateful": 37.92, "data_gap_recovery_stateful": 237.35, "data_gap_recovery_extended_stateful": 296.23, "argument_transformation_stateful": 417.11, "grounded_synthesis_stateful": 544.03, "inconsistent_api_recovery_stateful": 289.76}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 48}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged:full]", "model": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "mistral-small-3.2", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 78.2, "accuracy": 84.3, "completeness": 92.7, "efficiency": 77.6, "wasted": 1.1, "speed": 3.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 28, 
"data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 76}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 14, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 38}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, 
"tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 70, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 85, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 304}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 70, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 887, "inconsistent_api_recovery": 484, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 900, "inconsistent_api_recovery_stateful": 434}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 59.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 18.0, "grounded_synthesis": 392.0, "inconsistent_api_recovery": 139.0, 
"relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 31.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 48.0, "grounded_synthesis_stateful": 400.0, "inconsistent_api_recovery_stateful": 136.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 31.51, "argument_fidelity": 100.79, "tool_selection": 72.96, "basic_2step": 42.69, "sequential_3step": 106.2, "conditional_routing": 191.77, "sequential_reasoning": 114.42, "error_recovery": 58.31, "data_gap_recovery": 233.12, "data_gap_recovery_extended": 202.28, "argument_transformation": 16.3, "grounded_synthesis": 529.12, "inconsistent_api_recovery": 448.89, "relevance_detection_stateful": 30.6, "argument_fidelity_stateful": 100.84, "tool_selection_stateful": 72.56, "basic_2step_stateful": 62.05, "sequential_3step_stateful": 102.57, "conditional_routing_stateful": 198.68, "sequential_reasoning_stateful": 113.34, "error_recovery_stateful": 
58.26, "data_gap_recovery_stateful": 192.37, "data_gap_recovery_extended_stateful": 205.83, "argument_transformation_stateful": 47.2, "grounded_synthesis_stateful": 525.85, "inconsistent_api_recovery_stateful": 454.37}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [reforged:keep-last]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 75.7, "accuracy": 79.3, "completeness": 95.5, "efficiency": 99.1, "wasted": 0.5, "speed": 13.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 76, "sequential_reasoning": 92, "error_recovery": 98, "data_gap_recovery": 96, "data_gap_recovery_extended": 4, "argument_transformation": 16, "grounded_synthesis": 82, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 86, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 86, "inconsistent_api_recovery_stateful": 40}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 38, "sequential_reasoning": 46, "error_recovery": 49, "data_gap_recovery": 48, "data_gap_recovery_extended": 2, "argument_transformation": 8, "grounded_synthesis": 41, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 20}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 23}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 23}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 152, "sequential_reasoning": 184, "error_recovery": 98, "data_gap_recovery": 240, "data_gap_recovery_extended": 16, "argument_transformation": 40, "grounded_synthesis": 410, "inconsistent_api_recovery": 192, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 
150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 430, "inconsistent_api_recovery_stateful": 160}, "scenarioActualCalls": {"relevance_detection": 53, "argument_fidelity": 158, "tool_selection": 169, "basic_2step": 100, "sequential_3step": 163, "conditional_routing": 178, "sequential_reasoning": 199, "error_recovery": 196, "data_gap_recovery": 212, "data_gap_recovery_extended": 9, "argument_transformation": 31, "grounded_synthesis": 196, "inconsistent_api_recovery": 330, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 160, "tool_selection_stateful": 168, "basic_2step_stateful": 101, "sequential_3step_stateful": 168, "conditional_routing_stateful": 117, "sequential_reasoning_stateful": 183, "error_recovery_stateful": 196, "data_gap_recovery_stateful": 224, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 182, "inconsistent_api_recovery_stateful": 269}, "scenarioWastedSum": {"relevance_detection": 3.0, "argument_fidelity": 8.0, "tool_selection": 19.0, "basic_2step": 0.0, "sequential_3step": 13.0, "conditional_routing": 33.0, "sequential_reasoning": 15.0, "error_recovery": 98.0, "data_gap_recovery": 10.0, "data_gap_recovery_extended": 11.0, "argument_transformation": 0.0, "grounded_synthesis": 10.0, "inconsistent_api_recovery": 139.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 10.0, "tool_selection_stateful": 18.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 18.0, "conditional_routing_stateful": 23.0, "sequential_reasoning_stateful": 11.0, "error_recovery_stateful": 49.0, "data_gap_recovery_stateful": 18.0, "data_gap_recovery_extended_stateful": 0.0, 
"argument_transformation_stateful": 4.0, "grounded_synthesis_stateful": 9.0, "inconsistent_api_recovery_stateful": 131.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 23}, "scenarioSpeedSum": {"relevance_detection": 126.73, "argument_fidelity": 215.32, "tool_selection": 246.32, "basic_2step": 100.92, "sequential_3step": 252.09, "conditional_routing": 697.23, "sequential_reasoning": 466.85, "error_recovery": 393.74, "data_gap_recovery": 986.66, "data_gap_recovery_extended": 1029.15, "argument_transformation": 1240.26, "grounded_synthesis": 1121.44, "inconsistent_api_recovery": 1326.3, "relevance_detection_stateful": 123.68, "argument_fidelity_stateful": 227.61, "tool_selection_stateful": 239.43, "basic_2step_stateful": 105.79, "sequential_3step_stateful": 271.91, "conditional_routing_stateful": 635.49, "sequential_reasoning_stateful": 427.17, "error_recovery_stateful": 420.73, "data_gap_recovery_stateful": 992.59, "data_gap_recovery_extended_stateful": 995.08, "argument_transformation_stateful": 1292.96, "grounded_synthesis_stateful": 1111.62, "inconsistent_api_recovery_stateful": 1255.92}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 23}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [reforged:full]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 75.6, "accuracy": 80.8, "completeness": 93.6, "efficiency": 98.0, "wasted": 0.6, "speed": 12.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 92, "sequential_reasoning": 94, "error_recovery": 92, "data_gap_recovery": 88, "data_gap_recovery_extended": 0, "argument_transformation": 18, "grounded_synthesis": 82, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 94, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 84, "inconsistent_api_recovery_stateful": 26}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 
50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 46, "sequential_reasoning": 47, "error_recovery": 46, "data_gap_recovery": 44, "data_gap_recovery_extended": 0, "argument_transformation": 9, "grounded_synthesis": 41, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 13}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 
50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 15}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 15}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 184, "sequential_reasoning": 188, "error_recovery": 92, "data_gap_recovery": 220, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 410, "inconsistent_api_recovery": 112, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 152, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 141, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 420, 
"inconsistent_api_recovery_stateful": 104}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 154, "tool_selection": 170, "basic_2step": 100, "sequential_3step": 173, "conditional_routing": 209, "sequential_reasoning": 201, "error_recovery": 215, "data_gap_recovery": 242, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 190, "inconsistent_api_recovery": 196, "relevance_detection_stateful": 52, "argument_fidelity_stateful": 153, "tool_selection_stateful": 173, "basic_2step_stateful": 100, "sequential_3step_stateful": 167, "conditional_routing_stateful": 161, "sequential_reasoning_stateful": 201, "error_recovery_stateful": 228, "data_gap_recovery_stateful": 285, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 173, "inconsistent_api_recovery_stateful": 147}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 4.0, "tool_selection": 20.0, "basic_2step": 0.0, "sequential_3step": 23.0, "conditional_routing": 33.0, "sequential_reasoning": 14.0, "error_recovery": 125.0, "data_gap_recovery": 46.0, "data_gap_recovery_extended": 10.0, "argument_transformation": 0.0, "grounded_synthesis": 17.0, "inconsistent_api_recovery": 90.0, "relevance_detection_stateful": 2.0, "argument_fidelity_stateful": 3.0, "tool_selection_stateful": 23.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 17.0, "conditional_routing_stateful": 22.0, "sequential_reasoning_stateful": 17.0, "error_recovery_stateful": 87.0, "data_gap_recovery_stateful": 56.0, "data_gap_recovery_extended_stateful": 25.0, "argument_transformation_stateful": 3.0, "grounded_synthesis_stateful": 7.0, "inconsistent_api_recovery_stateful": 55.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 15}, "scenarioSpeedSum": {"relevance_detection": 114.6, "argument_fidelity": 217.41, "tool_selection": 246.69, "basic_2step": 92.28, "sequential_3step": 270.53, "conditional_routing": 661.12, "sequential_reasoning": 448.59, "error_recovery": 438.73, "data_gap_recovery": 1070.81, "data_gap_recovery_extended": 1052.91, "argument_transformation": 1156.92, "grounded_synthesis": 1029.03, "inconsistent_api_recovery": 848.9, "relevance_detection_stateful": 119.77, "argument_fidelity_stateful": 208.45, "tool_selection_stateful": 241.07, "basic_2step_stateful": 95.35, "sequential_3step_stateful": 244.03, "conditional_routing_stateful": 592.51, "sequential_reasoning_stateful": 413.05, "error_recovery_stateful": 419.07, "data_gap_recovery_stateful": 1118.78, "data_gap_recovery_extended_stateful": 1154.52, "argument_transformation_stateful": 1157.14, "grounded_synthesis_stateful": 1083.33, "inconsistent_api_recovery_stateful": 680.79}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 15}}, {"label": "phi-4-Q4_K_M LS/P [reforged]", "model": "phi-4-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "phi-4", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 75.3, "accuracy": 75.4, "completeness": 99.8, "efficiency": 82.8, "wasted": 0.9, "speed": 4.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 26, "sequential_reasoning": 62, "error_recovery": 94, "data_gap_recovery": 96, "data_gap_recovery_extended": 62, "argument_transformation": 34, "grounded_synthesis": 66, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 84, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 42, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 42}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 13, "sequential_reasoning": 31, "error_recovery": 47, "data_gap_recovery": 48, "data_gap_recovery_extended": 31, "argument_transformation": 17, "grounded_synthesis": 33, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 14, "sequential_reasoning_stateful": 42, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 21, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 21}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 52, "sequential_reasoning": 124, "error_recovery": 94, "data_gap_recovery": 240, "data_gap_recovery_extended": 248, "argument_transformation": 85, "grounded_synthesis": 330, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 168, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 168, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 300, "inconsistent_api_recovery_stateful": 168}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 69, "sequential_reasoning": 124, "error_recovery": 148, "data_gap_recovery": 306, "data_gap_recovery_extended": 212, 
"argument_transformation": 64, "grounded_synthesis": 524, "inconsistent_api_recovery": 424, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 109, "sequential_3step_stateful": 150, "conditional_routing_stateful": 73, "sequential_reasoning_stateful": 168, "error_recovery_stateful": 155, "data_gap_recovery_stateful": 308, "data_gap_recovery_extended_stateful": 146, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 511, "inconsistent_api_recovery_stateful": 265}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 25.0, "sequential_reasoning": 0.0, "error_recovery": 60.0, "data_gap_recovery": 68.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 9.0, "grounded_synthesis": 334.0, "inconsistent_api_recovery": 160.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 9.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 30.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 8.0, "data_gap_recovery_stateful": 74.0, "data_gap_recovery_extended_stateful": 12.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 310.0, "inconsistent_api_recovery_stateful": 123.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.47, "argument_fidelity": 55.4, "tool_selection": 45.78, "basic_2step": 31.77, "sequential_3step": 49.96, "conditional_routing": 234.48, "sequential_reasoning": 77.62, "error_recovery": 61.96, "data_gap_recovery": 220.8, "data_gap_recovery_extended": 311.36, "argument_transformation": 631.25, "grounded_synthesis": 635.32, "inconsistent_api_recovery": 424.25, "relevance_detection_stateful": 23.98, "argument_fidelity_stateful": 53.97, "tool_selection_stateful": 45.33, "basic_2step_stateful": 52.97, "sequential_3step_stateful": 50.01, "conditional_routing_stateful": 244.59, "sequential_reasoning_stateful": 81.45, "error_recovery_stateful": 51.82, "data_gap_recovery_stateful": 229.94, "data_gap_recovery_extended_stateful": 260.03, "argument_transformation_stateful": 555.38, "grounded_synthesis_stateful": 609.1, "inconsistent_api_recovery_stateful": 444.09}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, 
{"label": "gemma4:e4b-it-q4_K_M OL/N [reforged:full]", "model": "gemma4:e4b-it-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 74.8, "accuracy": 75.0, "completeness": 99.8, "efficiency": 82.6, "wasted": 0.8, "speed": 11.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 92, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 44, "inconsistent_api_recovery": 66, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 78, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 
50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 46, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 21}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, 
"inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 230, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 220, "inconsistent_api_recovery": 264, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 195, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 200, "inconsistent_api_recovery_stateful": 168}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 155, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 237, "sequential_reasoning": 200, "error_recovery": 157, "data_gap_recovery": 269, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 321, "inconsistent_api_recovery": 446, "relevance_detection_stateful": 52, "argument_fidelity_stateful": 150, "tool_selection_stateful": 157, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 227, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 153, 
"data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 283, "inconsistent_api_recovery_stateful": 289}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 0.0, "tool_selection": 5.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 49.0, "sequential_reasoning": 0.0, "error_recovery": 57.0, "data_gap_recovery": 45.0, "data_gap_recovery_extended": 16.0, "argument_transformation": 0.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 253.0, "relevance_detection_stateful": 2.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 7.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 47.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 3.0, "data_gap_recovery_stateful": 67.0, "data_gap_recovery_extended_stateful": 3.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 139.0, "inconsistent_api_recovery_stateful": 246.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 107.13, "argument_fidelity": 106.86, "tool_selection": 121.34, 
"basic_2step": 76.55, "sequential_3step": 110.42, "conditional_routing": 469.76, "sequential_reasoning": 210.93, "error_recovery": 146.49, "data_gap_recovery": 699.04, "data_gap_recovery_extended": 788.71, "argument_transformation": 1014.45, "grounded_synthesis": 927.35, "inconsistent_api_recovery": 1715.81, "relevance_detection_stateful": 122.87, "argument_fidelity_stateful": 132.41, "tool_selection_stateful": 155.18, "basic_2step_stateful": 69.27, "sequential_3step_stateful": 137.81, "conditional_routing_stateful": 568.16, "sequential_reasoning_stateful": 245.26, "error_recovery_stateful": 179.63, "data_gap_recovery_stateful": 983.12, "data_gap_recovery_extended_stateful": 880.46, "argument_transformation_stateful": 1347.08, "grounded_synthesis_stateful": 1011.46, "inconsistent_api_recovery_stateful": 2345.1}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged:full]", "model": "ministral-3:14b-instruct-2512-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 74.8, "accuracy": 74.8, "completeness": 100.0, "efficiency": 81.3, 
"wasted": 1.0, "speed": 6.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 96, "data_gap_recovery_extended": 56, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 28, "argument_transformation": 0, 
"grounded_synthesis": 0, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 8}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 240, "data_gap_recovery_extended": 224, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 64}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 150, "sequential_3step": 150, "conditional_routing": 252, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 319, "data_gap_recovery_extended": 312, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 417, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 150, "sequential_3step_stateful": 150, "conditional_routing_stateful": 264, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 336, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 111}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 50.0, 
"sequential_3step": 0.0, "conditional_routing": 52.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 81.0, "data_gap_recovery_extended": 167.0, "argument_transformation": 120.0, "grounded_synthesis": 21.0, "inconsistent_api_recovery": 131.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 64.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 92.0, "data_gap_recovery_extended_stateful": 10.0, "argument_transformation_stateful": 115.0, "grounded_synthesis_stateful": 31.0, "inconsistent_api_recovery_stateful": 165.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.88, "argument_fidelity": 72.85, "tool_selection": 45.85, "basic_2step": 52.41, "sequential_3step": 97.11, "conditional_routing": 431.22, "sequential_reasoning": 197.04, "error_recovery": 35.26, "data_gap_recovery": 364.63, "data_gap_recovery_extended": 631.08, "argument_transformation": 591.71, "grounded_synthesis": 706.31, "inconsistent_api_recovery": 873.37, 
"relevance_detection_stateful": 22.94, "argument_fidelity_stateful": 72.97, "tool_selection_stateful": 45.95, "basic_2step_stateful": 41.57, "sequential_3step_stateful": 111.88, "conditional_routing_stateful": 455.42, "sequential_reasoning_stateful": 196.41, "error_recovery_stateful": 35.4, "data_gap_recovery_stateful": 238.11, "data_gap_recovery_extended_stateful": 411.71, "argument_transformation_stateful": 641.57, "grounded_synthesis_stateful": 727.5, "inconsistent_api_recovery_stateful": 921.98}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 74.9, "accuracy": 83.3, "completeness": 89.9, "efficiency": 78.6, "wasted": 1.3, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 20, 
"argument_transformation": 2, "grounded_synthesis": 42, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 62, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 10, "argument_transformation": 1, "grounded_synthesis": 21, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 28, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 28, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, 
"sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 80, "argument_transformation": 5, "grounded_synthesis": 210, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 310, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 148, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 344, "error_recovery": 150, "data_gap_recovery": 150, "data_gap_recovery_extended": 60, "argument_transformation": 5, "grounded_synthesis": 294, "inconsistent_api_recovery": 650, "relevance_detection_stateful": 149, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 325, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 421, "inconsistent_api_recovery_stateful": 650}, "scenarioWastedSum": {"relevance_detection": 98.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 144.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 36.0, "grounded_synthesis": 117.0, "inconsistent_api_recovery": 250.0, "relevance_detection_stateful": 99.0, 
"argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 125.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 18.0, "grounded_synthesis_stateful": 195.0, "inconsistent_api_recovery_stateful": 250.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 28, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 36.19, "argument_fidelity": 39.21, "tool_selection": 0.0, "basic_2step": 20.54, "sequential_3step": 95.29, "conditional_routing": 117.13, "sequential_reasoning": 196.48, "error_recovery": 24.11, "data_gap_recovery": 124.23, "data_gap_recovery_extended": 348.97, "argument_transformation": 356.15, "grounded_synthesis": 200.66, "inconsistent_api_recovery": 232.09, "relevance_detection_stateful": 35.93, "argument_fidelity_stateful": 39.07, "tool_selection_stateful": 0.0, "basic_2step_stateful": 22.07, "sequential_3step_stateful": 95.41, "conditional_routing_stateful": 116.97, "sequential_reasoning_stateful": 176.9, "error_recovery_stateful": 24.12, "data_gap_recovery_stateful": 
126.56, "data_gap_recovery_extended_stateful": 351.0, "argument_transformation_stateful": 328.21, "grounded_synthesis_stateful": 287.12, "inconsistent_api_recovery_stateful": 231.0}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 28, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [reforged:full]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 73.7, "accuracy": 73.7, "completeness": 100.0, "efficiency": 85.9, "wasted": 0.6, "speed": 12.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 48, "sequential_reasoning": 100, "error_recovery": 92, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 36, "grounded_synthesis": 28, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 98, 
"data_gap_recovery_stateful": 82, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 92}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 24, "sequential_reasoning": 50, "error_recovery": 46, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 18, "grounded_synthesis": 14, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 46}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 
50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 96, "sequential_reasoning": 200, "error_recovery": 92, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 90, "grounded_synthesis": 140, "inconsistent_api_recovery": 352, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 150, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 368}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 127, "sequential_reasoning": 200, "error_recovery": 144, "data_gap_recovery": 256, "data_gap_recovery_extended": 0, "argument_transformation": 72, "grounded_synthesis": 161, "inconsistent_api_recovery": 529, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 144, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 151, "data_gap_recovery_stateful": 231, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 112, "inconsistent_api_recovery_stateful": 558}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 43.0, "sequential_reasoning": 0.0, "error_recovery": 58.0, "data_gap_recovery": 33.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 53.0, "inconsistent_api_recovery": 202.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 54.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, "data_gap_recovery_stateful": 31.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 34.0, 
"inconsistent_api_recovery_stateful": 204.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 58.16, "argument_fidelity": 160.25, "tool_selection": 157.3, "basic_2step": 81.48, "sequential_3step": 147.61, "conditional_routing": 670.28, "sequential_reasoning": 273.45, "error_recovery": 477.87, "data_gap_recovery": 784.14, "data_gap_recovery_extended": 792.14, "argument_transformation": 1428.34, "grounded_synthesis": 1247.52, "inconsistent_api_recovery": 1996.7, "relevance_detection_stateful": 52.88, "argument_fidelity_stateful": 176.3, "tool_selection_stateful": 153.22, "basic_2step_stateful": 77.5, "sequential_3step_stateful": 146.95, "conditional_routing_stateful": 689.08, "sequential_reasoning_stateful": 275.8, "error_recovery_stateful": 428.75, "data_gap_recovery_stateful": 772.38, "data_gap_recovery_extended_stateful": 788.71, "argument_transformation_stateful": 1419.16, "grounded_synthesis_stateful": 1198.52, "inconsistent_api_recovery_stateful": 1991.66}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma4:e4b-it-q8_0 OL/N [reforged:full]", "model": "gemma4:e4b-it-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 73.6, "accuracy": 73.8, "completeness": 99.8, "efficiency": 85.3, "wasted": 0.8, "speed": 12.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 78, "sequential_reasoning": 98, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 34, "inconsistent_api_recovery": 60, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 78, "sequential_reasoning_stateful": 94, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 34}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 39, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 17, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 17}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 156, "sequential_reasoning": 196, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 170, "inconsistent_api_recovery": 240, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 156, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 136}, "scenarioActualCalls": 
{"relevance_detection": 52, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 193, "sequential_reasoning": 196, "error_recovery": 157, "data_gap_recovery": 275, "data_gap_recovery_extended": 0, "argument_transformation": 19, "grounded_synthesis": 251, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 151, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 194, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 154, "data_gap_recovery_stateful": 259, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 282, "inconsistent_api_recovery_stateful": 212}, "scenarioWastedSum": {"relevance_detection": 2.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 37.0, "sequential_reasoning": 0.0, "error_recovery": 57.0, "data_gap_recovery": 33.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 163.0, "inconsistent_api_recovery": 204.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, "data_gap_recovery_stateful": 27.0, "data_gap_recovery_extended_stateful": 14.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 193.0, "inconsistent_api_recovery_stateful": 205.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 120.14, "argument_fidelity": 135.63, "tool_selection": 148.33, "basic_2step": 82.98, "sequential_3step": 149.89, "conditional_routing": 572.98, "sequential_reasoning": 241.63, "error_recovery": 211.98, "data_gap_recovery": 749.8, "data_gap_recovery_extended": 857.51, "argument_transformation": 1557.36, "grounded_synthesis": 1195.18, "inconsistent_api_recovery": 2237.69, "relevance_detection_stateful": 121.56, "argument_fidelity_stateful": 141.51, "tool_selection_stateful": 136.38, "basic_2step_stateful": 76.93, "sequential_3step_stateful": 155.35, "conditional_routing_stateful": 558.35, "sequential_reasoning_stateful": 239.51, "error_recovery_stateful": 197.59, "data_gap_recovery_stateful": 780.02, "data_gap_recovery_extended_stateful": 881.04, "argument_transformation_stateful": 1517.79, "grounded_synthesis_stateful": 1292.08, "inconsistent_api_recovery_stateful": 2178.44}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [reforged:keep-last]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 74.1, "accuracy": 74.2, "completeness": 99.8, "efficiency": 84.8, "wasted": 0.6, "speed": 13.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 48, "sequential_reasoning": 100, "error_recovery": 96, "data_gap_recovery": 84, "data_gap_recovery_extended": 0, "argument_transformation": 22, "grounded_synthesis": 28, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 96}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 24, "sequential_reasoning": 50, "error_recovery": 48, "data_gap_recovery": 42, "data_gap_recovery_extended": 0, "argument_transformation": 11, "grounded_synthesis": 14, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 48}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 96, "sequential_reasoning": 200, "error_recovery": 96, "data_gap_recovery": 210, "data_gap_recovery_extended": 0, "argument_transformation": 55, "grounded_synthesis": 140, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 104, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 90, "inconsistent_api_recovery_stateful": 384}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 139, "sequential_reasoning": 200, "error_recovery": 154, "data_gap_recovery": 241, "data_gap_recovery_extended": 0, "argument_transformation": 46, "grounded_synthesis": 162, 
"inconsistent_api_recovery": 573, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 156, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 152, "data_gap_recovery_stateful": 256, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 99, "inconsistent_api_recovery_stateful": 578}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 55.0, "sequential_reasoning": 0.0, "error_recovery": 60.0, "data_gap_recovery": 35.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 48.0, "inconsistent_api_recovery": 204.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 61.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 6.0, "data_gap_recovery_stateful": 35.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 47.0, "inconsistent_api_recovery_stateful": 197.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 47.7, "argument_fidelity": 177.33, "tool_selection": 161.7, "basic_2step": 77.96, "sequential_3step": 157.96, "conditional_routing": 724.58, "sequential_reasoning": 281.86, "error_recovery": 580.79, "data_gap_recovery": 832.63, "data_gap_recovery_extended": 894.97, "argument_transformation": 1486.95, "grounded_synthesis": 1258.89, "inconsistent_api_recovery": 2080.29, "relevance_detection_stateful": 51.43, "argument_fidelity_stateful": 168.52, "tool_selection_stateful": 158.41, "basic_2step_stateful": 82.8, "sequential_3step_stateful": 153.8, "conditional_routing_stateful": 743.57, "sequential_reasoning_stateful": 303.57, "error_recovery_stateful": 540.4, "data_gap_recovery_stateful": 839.73, "data_gap_recovery_extended_stateful": 848.89, "argument_transformation_stateful": 1543.56, "grounded_synthesis_stateful": 1305.2, "inconsistent_api_recovery_stateful": 1951.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "claude-haiku-4-5-20251001 AN/N [bare+any]", 
"model": "claude-haiku-4-5-20251001", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 74.0, "accuracy": 80.2, "completeness": 92.3, "efficiency": 100.0, "wasted": 0.0, "speed": 5.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 86, "argument_transformation": 0, "grounded_synthesis": 32, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 84, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 43, "argument_transformation": 0, "grounded_synthesis": 16, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 42, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 344, "argument_transformation": 0, "grounded_synthesis": 160, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 336, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 152, "data_gap_recovery_extended": 175, "argument_transformation": 0, "grounded_synthesis": 72, "inconsistent_api_recovery": 202, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 152, "data_gap_recovery_extended_stateful": 177, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 51, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 63.75, "argument_fidelity": 161.12, "tool_selection": 162.89, "basic_2step": 81.97, "sequential_3step": 156.28, "conditional_routing": 264.5, "sequential_reasoning": 
263.55, "error_recovery": 0.0, "data_gap_recovery": 276.89, "data_gap_recovery_extended": 415.92, "argument_transformation": 337.53, "grounded_synthesis": 556.86, "inconsistent_api_recovery": 326.44, "relevance_detection_stateful": 60.18, "argument_fidelity_stateful": 192.62, "tool_selection_stateful": 161.9, "basic_2step_stateful": 118.04, "sequential_3step_stateful": 178.21, "conditional_routing_stateful": 253.75, "sequential_reasoning_stateful": 237.04, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 326.97, "data_gap_recovery_extended_stateful": 468.17, "argument_transformation_stateful": 328.72, "grounded_synthesis_stateful": 544.15, "inconsistent_api_recovery_stateful": 299.22}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/P [bare:full]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 74.3, "accuracy": 81.0, "completeness": 91.8, "efficiency": 100.0, "wasted": 0.0, "speed": 24.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, 
"sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 92, "data_gap_recovery_extended": 38, "argument_transformation": 14, "grounded_synthesis": 84, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 88, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 19, "argument_transformation": 7, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 
50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 230, "data_gap_recovery_extended": 152, "argument_transformation": 35, "grounded_synthesis": 420, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 176, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 440, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 214, "data_gap_recovery_extended": 77, "argument_transformation": 30, "grounded_synthesis": 258, "inconsistent_api_recovery": 221, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 122, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 223, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 255, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 8.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, 
"argument_transformation": 1.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 6.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 5.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 6.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 205.2, "argument_fidelity": 492.05, "tool_selection": 339.76, "basic_2step": 254.0, "sequential_3step": 495.57, "conditional_routing": 1160.72, "sequential_reasoning": 678.59, "error_recovery": 0.0, "data_gap_recovery": 1346.02, "data_gap_recovery_extended": 1533.84, "argument_transformation": 2894.93, "grounded_synthesis": 2945.57, "inconsistent_api_recovery": 1920.33, "relevance_detection_stateful": 211.42, "argument_fidelity_stateful": 497.9, "tool_selection_stateful": 331.17, "basic_2step_stateful": 284.87, "sequential_3step_stateful": 472.13, 
"conditional_routing_stateful": 1167.08, "sequential_reasoning_stateful": 657.21, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1422.21, "data_gap_recovery_extended_stateful": 1584.39, "argument_transformation_stateful": 2718.37, "grounded_synthesis_stateful": 2977.57, "inconsistent_api_recovery_stateful": 1989.43}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/P [reforged:keep-last]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 72.8, "accuracy": 73.0, "completeness": 99.7, "efficiency": 88.5, "wasted": 0.4, "speed": 28.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 98, "error_recovery": 80, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 6, "grounded_synthesis": 8, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 96, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 60, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 54}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 40, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 3, "grounded_synthesis": 4, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 30, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 27}, "scenarioCompleted": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 196, "error_recovery": 80, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 40, "inconsistent_api_recovery": 
376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 90, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 216}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 152, "basic_2step": 101, "sequential_3step": 151, "conditional_routing": 260, "sequential_reasoning": 196, "error_recovery": 121, "data_gap_recovery": 232, "data_gap_recovery_extended": 0, "argument_transformation": 12, "grounded_synthesis": 40, "inconsistent_api_recovery": 515, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 153, "basic_2step_stateful": 100, "sequential_3step_stateful": 146, "conditional_routing_stateful": 247, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 93, "data_gap_recovery_stateful": 253, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 58, "inconsistent_api_recovery_stateful": 327}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 2.0, "basic_2step": 1.0, "sequential_3step": 1.0, "conditional_routing": 64.0, "sequential_reasoning": 0.0, "error_recovery": 51.0, "data_gap_recovery": 9.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 0.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 145.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 3.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 7.0, "conditional_routing_stateful": 55.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, 
"data_gap_recovery_stateful": 9.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 7.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 80.34, "argument_fidelity": 287.17, "tool_selection": 361.62, "basic_2step": 310.06, "sequential_3step": 678.02, "conditional_routing": 1297.74, "sequential_reasoning": 473.18, "error_recovery": 714.87, "data_gap_recovery": 954.42, "data_gap_recovery_extended": 1620.5, "argument_transformation": 3035.69, "grounded_synthesis": 3171.9, "inconsistent_api_recovery": 5077.87, "relevance_detection_stateful": 79.62, "argument_fidelity_stateful": 290.83, "tool_selection_stateful": 365.14, "basic_2step_stateful": 280.63, "sequential_3step_stateful": 743.2, "conditional_routing_stateful": 1270.07, "sequential_reasoning_stateful": 460.92, "error_recovery_stateful": 810.11, "data_gap_recovery_stateful": 976.56, "data_gap_recovery_extended_stateful": 1597.05, "argument_transformation_stateful": 2745.3, "grounded_synthesis_stateful": 3577.54, "inconsistent_api_recovery_stateful": 5084.44}, "scenarioSpeedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3-8B-Q8_0 LS/P [reforged:full]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 72.8, "accuracy": 72.9, "completeness": 99.8, "efficiency": 88.4, "wasted": 0.4, "speed": 28.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 70, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 20, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 92, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 66, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 56}, "scenarioRuns": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 35, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 10, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 33, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 28}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 70, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 100, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 138, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 99, "data_gap_recovery_stateful": 230, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 224}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 153, "basic_2step": 100, "sequential_3step": 151, "conditional_routing": 261, "sequential_reasoning": 200, "error_recovery": 107, "data_gap_recovery": 229, "data_gap_recovery_extended": 0, "argument_transformation": 6, "grounded_synthesis": 90, "inconsistent_api_recovery": 553, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 152, "basic_2step_stateful": 100, "sequential_3step_stateful": 139, "conditional_routing_stateful": 264, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 51, "inconsistent_api_recovery_stateful": 340}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 3.0, "basic_2step": 0.0, "sequential_3step": 1.0, "conditional_routing": 65.0, "sequential_reasoning": 0.0, "error_recovery": 52.0, "data_gap_recovery": 9.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 172.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 2.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 8.0, "conditional_routing_stateful": 64.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 9.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 7.0, "inconsistent_api_recovery_stateful": 165.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 77.12, "argument_fidelity": 305.19, "tool_selection": 370.45, "basic_2step": 269.44, "sequential_3step": 731.57, "conditional_routing": 1276.94, "sequential_reasoning": 476.81, "error_recovery": 779.32, "data_gap_recovery": 962.17, "data_gap_recovery_extended": 1528.65, "argument_transformation": 2559.83, "grounded_synthesis": 3708.25, "inconsistent_api_recovery": 5668.85, "relevance_detection_stateful": 79.47, "argument_fidelity_stateful": 294.35, "tool_selection_stateful": 356.11, "basic_2step_stateful": 282.8, "sequential_3step_stateful": 750.29, "conditional_routing_stateful": 1289.0, "sequential_reasoning_stateful": 482.42, "error_recovery_stateful": 787.48, "data_gap_recovery_stateful": 987.64, "data_gap_recovery_extended_stateful": 1625.52, "argument_transformation_stateful": 2749.65, "grounded_synthesis_stateful": 3571.61, "inconsistent_api_recovery_stateful": 5591.71}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [reforged:keep-last]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 73.2, "accuracy": 73.2, "completeness": 100.0, "efficiency": 84.5, "wasted": 0.6, "speed": 8.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 40, "sequential_reasoning": 98, "error_recovery": 96, "data_gap_recovery": 92, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 24, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 86, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 92}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 20, "sequential_reasoning": 49, "error_recovery": 48, "data_gap_recovery": 46, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 12, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 19, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 46}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 80, "sequential_reasoning": 196, "error_recovery": 96, "data_gap_recovery": 230, "data_gap_recovery_extended": 0, "argument_transformation": 25, "grounded_synthesis": 120, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 160, "inconsistent_api_recovery_stateful": 368}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 95, 
"sequential_reasoning": 196, "error_recovery": 146, "data_gap_recovery": 255, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 158, "inconsistent_api_recovery": 589, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 104, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 232, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 230, "inconsistent_api_recovery_stateful": 566}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 17.0, "sequential_reasoning": 0.0, "error_recovery": 53.0, "data_gap_recovery": 30.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 119.0, "inconsistent_api_recovery": 210.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 25.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 117.0, "inconsistent_api_recovery_stateful": 211.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 
50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 37.66, "argument_fidelity": 83.02, "tool_selection": 108.76, "basic_2step": 47.53, "sequential_3step": 100.9, "conditional_routing": 466.24, "sequential_reasoning": 175.58, "error_recovery": 211.37, "data_gap_recovery": 502.79, "data_gap_recovery_extended": 463.95, "argument_transformation": 1082.98, "grounded_synthesis": 893.71, "inconsistent_api_recovery": 1309.5, "relevance_detection_stateful": 31.04, "argument_fidelity_stateful": 79.82, "tool_selection_stateful": 104.69, "basic_2step_stateful": 45.91, "sequential_3step_stateful": 100.56, "conditional_routing_stateful": 458.47, "sequential_reasoning_stateful": 176.93, "error_recovery_stateful": 132.2, "data_gap_recovery_stateful": 525.51, "data_gap_recovery_extended_stateful": 450.71, "argument_transformation_stateful": 1241.76, "grounded_synthesis_stateful": 852.45, "inconsistent_api_recovery_stateful": 1325.83}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [reforged:full]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 72.9, "accuracy": 72.9, "completeness": 100.0, "efficiency": 84.9, "wasted": 0.6, "speed": 8.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 54, "sequential_reasoning": 98, "error_recovery": 98, "data_gap_recovery": 86, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 28, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 90}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 27, "sequential_reasoning": 49, "error_recovery": 49, "data_gap_recovery": 43, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 14, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 45}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 
50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 108, "sequential_reasoning": 196, "error_recovery": 98, "data_gap_recovery": 215, "data_gap_recovery_extended": 0, "argument_transformation": 25, "grounded_synthesis": 140, "inconsistent_api_recovery": 352, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 360}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 135, "sequential_reasoning": 196, "error_recovery": 148, "data_gap_recovery": 237, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 190, "inconsistent_api_recovery": 530, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 150, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 145, "data_gap_recovery_stateful": 258, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 207, "inconsistent_api_recovery_stateful": 555}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 27.0, "sequential_reasoning": 0.0, "error_recovery": 52.0, "data_gap_recovery": 26.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 116.0, "inconsistent_api_recovery": 198.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 20.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 27.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 122.0, "inconsistent_api_recovery_stateful": 210.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 35.16, "argument_fidelity": 85.18, "tool_selection": 109.31, "basic_2step": 47.89, "sequential_3step": 107.7, "conditional_routing": 477.37, "sequential_reasoning": 188.86, "error_recovery": 192.63, "data_gap_recovery": 524.23, "data_gap_recovery_extended": 455.36, "argument_transformation": 1135.73, "grounded_synthesis": 873.97, "inconsistent_api_recovery": 1296.69, "relevance_detection_stateful": 31.34, "argument_fidelity_stateful": 84.67, "tool_selection_stateful": 103.28, "basic_2step_stateful": 48.33, "sequential_3step_stateful": 102.34, "conditional_routing_stateful": 465.25, "sequential_reasoning_stateful": 175.49, "error_recovery_stateful": 197.65, "data_gap_recovery_stateful": 539.94, "data_gap_recovery_extended_stateful": 433.97, "argument_transformation_stateful": 1155.58, "grounded_synthesis_stateful": 848.08, "inconsistent_api_recovery_stateful": 1330.45}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [reforged]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "gemma4-e4b", "quant": 
"q8_0", "gen": 3, "retired": false, "score": 73.2, "accuracy": 73.3, "completeness": 99.8, "efficiency": 85.1, "wasted": 0.6, "speed": 13.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 54, "sequential_reasoning": 98, "error_recovery": 94, "data_gap_recovery": 80, "data_gap_recovery_extended": 0, "argument_transformation": 28, "grounded_synthesis": 20, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 98, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 96}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 27, "sequential_reasoning": 49, 
"error_recovery": 47, "data_gap_recovery": 40, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 10, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 48}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 108, "sequential_reasoning": 196, "error_recovery": 94, "data_gap_recovery": 200, "data_gap_recovery_extended": 0, "argument_transformation": 70, "grounded_synthesis": 100, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 80, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 147, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 384}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 152, "sequential_reasoning": 196, "error_recovery": 144, "data_gap_recovery": 224, "data_gap_recovery_extended": 0, "argument_transformation": 57, "grounded_synthesis": 116, "inconsistent_api_recovery": 578, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 120, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 151, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 69, "inconsistent_api_recovery_stateful": 586}, 
"scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 57.0, "sequential_reasoning": 0.0, "error_recovery": 56.0, "data_gap_recovery": 33.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 40.0, "inconsistent_api_recovery": 209.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 51.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 4.0, "data_gap_recovery_stateful": 30.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 42.0, "inconsistent_api_recovery_stateful": 205.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 53.47, "argument_fidelity": 169.21, "tool_selection": 160.08, "basic_2step": 80.99, "sequential_3step": 154.22, "conditional_routing": 712.09, "sequential_reasoning": 294.43, "error_recovery": 514.35, "data_gap_recovery": 821.02, "data_gap_recovery_extended": 855.21, 
"argument_transformation": 1420.99, "grounded_synthesis": 1305.16, "inconsistent_api_recovery": 2097.35, "relevance_detection_stateful": 59.26, "argument_fidelity_stateful": 177.13, "tool_selection_stateful": 160.58, "basic_2step_stateful": 85.42, "sequential_3step_stateful": 154.44, "conditional_routing_stateful": 710.05, "sequential_reasoning_stateful": 313.35, "error_recovery_stateful": 499.64, "data_gap_recovery_stateful": 819.53, "data_gap_recovery_extended_stateful": 848.87, "argument_transformation_stateful": 1495.73, "grounded_synthesis_stateful": 1225.17, "inconsistent_api_recovery_stateful": 2044.46}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3-8B-Q4_K_M LS/P [reforged:keep-last]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 72.2, "accuracy": 72.3, "completeness": 99.9, "efficiency": 87.6, "wasted": 0.5, "speed": 17.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, 
"error_recovery": 62, "data_gap_recovery": 66, "data_gap_recovery_extended": 0, "argument_transformation": 30, "grounded_synthesis": 8, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 74, "data_gap_recovery_stateful": 68, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 74}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 31, "data_gap_recovery": 33, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 4, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 37}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, 
"argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 62, "data_gap_recovery": 165, "data_gap_recovery_extended": 0, "argument_transformation": 75, "grounded_synthesis": 40, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 111, "data_gap_recovery_stateful": 170, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 296}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 170, "basic_2step": 102, "sequential_3step": 150, "conditional_routing": 254, "sequential_reasoning": 200, "error_recovery": 93, "data_gap_recovery": 157, "data_gap_recovery_extended": 0, "argument_transformation": 59, "grounded_synthesis": 33, "inconsistent_api_recovery": 549, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 172, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 252, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 111, "data_gap_recovery_stateful": 156, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 448}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 20.0, "basic_2step": 2.0, "sequential_3step": 0.0, "conditional_routing": 54.0, "sequential_reasoning": 0.0, "error_recovery": 49.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 7.0, "argument_transformation": 0.0, "grounded_synthesis": 1.0, 
"inconsistent_api_recovery": 201.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 22.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 56.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 172.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 74.56, "argument_fidelity": 190.39, "tool_selection": 336.03, "basic_2step": 174.14, "sequential_3step": 412.54, "conditional_routing": 938.53, "sequential_reasoning": 343.47, "error_recovery": 409.0, "data_gap_recovery": 731.32, "data_gap_recovery_extended": 1067.5, "argument_transformation": 1761.79, "grounded_synthesis": 1735.67, "inconsistent_api_recovery": 3466.13, "relevance_detection_stateful": 74.68, "argument_fidelity_stateful": 190.94, "tool_selection_stateful": 301.91, "basic_2step_stateful": 237.39, "sequential_3step_stateful": 437.14, "conditional_routing_stateful": 932.82, 
"sequential_reasoning_stateful": 354.43, "error_recovery_stateful": 409.26, "data_gap_recovery_stateful": 694.61, "data_gap_recovery_extended_stateful": 1096.8, "argument_transformation_stateful": 1770.74, "grounded_synthesis_stateful": 1896.14, "inconsistent_api_recovery_stateful": 3207.72}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/P [reforged]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 72.0, "accuracy": 72.3, "completeness": 99.6, "efficiency": 88.2, "wasted": 0.4, "speed": 28.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 56, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 10, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, 
"conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 58, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 62}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 28, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 5, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 29, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 31}, "scenarioCompleted": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 56, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 50, "inconsistent_api_recovery": 384, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 248}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 155, "basic_2step": 100, "sequential_3step": 151, "conditional_routing": 249, "sequential_reasoning": 200, "error_recovery": 85, "data_gap_recovery": 232, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 48, "inconsistent_api_recovery": 538, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 153, "basic_2step_stateful": 100, "sequential_3step_stateful": 145, "conditional_routing_stateful": 258, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 98, "inconsistent_api_recovery_stateful": 373}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 5.0, "basic_2step": 0.0, "sequential_3step": 1.0, "conditional_routing": 57.0, "sequential_reasoning": 0.0, "error_recovery": 51.0, "data_gap_recovery": 8.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 155.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 3.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 5.0, "conditional_routing_stateful": 62.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 
12.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 4.0, "inconsistent_api_recovery_stateful": 163.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 79.42, "argument_fidelity": 302.97, "tool_selection": 381.07, "basic_2step": 273.96, "sequential_3step": 641.0, "conditional_routing": 1287.14, "sequential_reasoning": 483.62, "error_recovery": 690.11, "data_gap_recovery": 974.74, "data_gap_recovery_extended": 1577.2, "argument_transformation": 2813.58, "grounded_synthesis": 3269.35, "inconsistent_api_recovery": 5457.05, "relevance_detection_stateful": 77.94, "argument_fidelity_stateful": 297.84, "tool_selection_stateful": 357.93, "basic_2step_stateful": 305.03, "sequential_3step_stateful": 751.2, "conditional_routing_stateful": 1278.03, "sequential_reasoning_stateful": 486.78, "error_recovery_stateful": 689.7, "data_gap_recovery_stateful": 973.72, "data_gap_recovery_extended_stateful": 1532.99, "argument_transformation_stateful": 2820.13, "grounded_synthesis_stateful": 3896.59, "inconsistent_api_recovery_stateful": 5324.27}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 
50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/P [reforged:full]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 71.8, "accuracy": 71.9, "completeness": 99.8, "efficiency": 86.5, "wasted": 0.5, "speed": 24.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 72, "data_gap_recovery": 72, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 30, "inconsistent_api_recovery": 74, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 74, "data_gap_recovery_stateful": 68, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 48}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 
50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 36, "data_gap_recovery": 36, "data_gap_recovery_extended": 1, "argument_transformation": 0, "grounded_synthesis": 15, "inconsistent_api_recovery": 37, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 24}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 72, "data_gap_recovery": 180, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 150, "inconsistent_api_recovery": 296, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 111, "data_gap_recovery_stateful": 170, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 190, "inconsistent_api_recovery_stateful": 192}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 187, "basic_2step": 104, "sequential_3step": 152, "conditional_routing": 266, "sequential_reasoning": 200, "error_recovery": 108, "data_gap_recovery": 172, "data_gap_recovery_extended": 6, "argument_transformation": 0, "grounded_synthesis": 193, "inconsistent_api_recovery": 389, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 190, "basic_2step_stateful": 105, "sequential_3step_stateful": 150, "conditional_routing_stateful": 255, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 111, "data_gap_recovery_stateful": 163, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 249, "inconsistent_api_recovery_stateful": 265}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 37.0, "basic_2step": 4.0, "sequential_3step": 2.0, "conditional_routing": 75.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 77.0, "inconsistent_api_recovery": 102.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 40.0, "basic_2step_stateful": 5.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 71.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 91.0, "inconsistent_api_recovery_stateful": 92.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 79.73, "argument_fidelity": 315.72, "tool_selection": 494.15, "basic_2step": 161.09, "sequential_3step": 529.5, "conditional_routing": 1317.74, "sequential_reasoning": 477.44, "error_recovery": 542.71, "data_gap_recovery": 870.21, "data_gap_recovery_extended": 1299.36, "argument_transformation": 2493.38, "grounded_synthesis": 3869.65, "inconsistent_api_recovery": 3456.95, "relevance_detection_stateful": 79.2, "argument_fidelity_stateful": 309.72, "tool_selection_stateful": 515.98, "basic_2step_stateful": 173.35, "sequential_3step_stateful": 475.86, "conditional_routing_stateful": 1282.67, "sequential_reasoning_stateful": 472.98, "error_recovery_stateful": 471.57, "data_gap_recovery_stateful": 811.37, "data_gap_recovery_extended_stateful": 1338.74, "argument_transformation_stateful": 2384.47, "grounded_synthesis_stateful": 4135.56, "inconsistent_api_recovery_stateful": 3142.24}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/P [reforged:keep-last]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "keep-last", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 71.8, "accuracy": 71.8, "completeness": 100.0, "efficiency": 85.8, "wasted": 0.5, "speed": 23.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 72, "data_gap_recovery": 58, "data_gap_recovery_extended": 2, "argument_transformation": 4, "grounded_synthesis": 28, "inconsistent_api_recovery": 72, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 76, "data_gap_recovery_stateful": 74, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 56}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 36, "data_gap_recovery": 29, "data_gap_recovery_extended": 1, "argument_transformation": 2, "grounded_synthesis": 14, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 38, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 28}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 72, "data_gap_recovery": 145, "data_gap_recovery_extended": 8, "argument_transformation": 10, "grounded_synthesis": 140, "inconsistent_api_recovery": 288, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 114, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 160, "inconsistent_api_recovery_stateful": 224}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 189, "basic_2step": 103, "sequential_3step": 152, "conditional_routing": 264, "sequential_reasoning": 200, 
"error_recovery": 108, "data_gap_recovery": 137, "data_gap_recovery_extended": 7, "argument_transformation": 8, "grounded_synthesis": 196, "inconsistent_api_recovery": 363, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 191, "basic_2step_stateful": 107, "sequential_3step_stateful": 150, "conditional_routing_stateful": 263, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 115, "data_gap_recovery_stateful": 177, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 221, "inconsistent_api_recovery_stateful": 329}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 39.0, "basic_2step": 3.0, "sequential_3step": 2.0, "conditional_routing": 68.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 78.0, "inconsistent_api_recovery": 96.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 41.0, "basic_2step_stateful": 7.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 77.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 1.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 79.0, "inconsistent_api_recovery_stateful": 118.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 77.8, "argument_fidelity": 302.45, "tool_selection": 482.84, "basic_2step": 155.72, "sequential_3step": 456.2, "conditional_routing": 1254.8, "sequential_reasoning": 461.61, "error_recovery": 575.03, "data_gap_recovery": 878.59, "data_gap_recovery_extended": 1304.81, "argument_transformation": 2394.48, "grounded_synthesis": 3617.57, "inconsistent_api_recovery": 3312.15, "relevance_detection_stateful": 77.4, "argument_fidelity_stateful": 302.44, "tool_selection_stateful": 496.8, "basic_2step_stateful": 169.61, "sequential_3step_stateful": 474.09, "conditional_routing_stateful": 1308.71, "sequential_reasoning_stateful": 461.41, "error_recovery_stateful": 565.01, "data_gap_recovery_stateful": 788.65, "data_gap_recovery_extended_stateful": 1319.92, "argument_transformation_stateful": 2426.65, "grounded_synthesis_stateful": 3731.52, "inconsistent_api_recovery_stateful": 3590.01}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [reforged]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 72.4, "accuracy": 72.4, "completeness": 99.9, "efficiency": 85.1, "wasted": 0.6, "speed": 8.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 56, "sequential_reasoning": 96, "error_recovery": 100, "data_gap_recovery": 86, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 28, "inconsistent_api_recovery": 84, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 96, "data_gap_recovery_stateful": 74, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 92}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 
50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 28, "sequential_reasoning": 48, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 14, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 46}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 112, "sequential_reasoning": 192, "error_recovery": 100, "data_gap_recovery": 215, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 140, "inconsistent_api_recovery": 336, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 144, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 368}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 131, "sequential_reasoning": 192, "error_recovery": 151, "data_gap_recovery": 229, "data_gap_recovery_extended": 0, "argument_transformation": 28, "grounded_synthesis": 198, "inconsistent_api_recovery": 510, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, 
"conditional_routing_stateful": 95, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 145, "data_gap_recovery_stateful": 197, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 152, "inconsistent_api_recovery_stateful": 571}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 20.0, "sequential_reasoning": 0.0, "error_recovery": 51.0, "data_gap_recovery": 20.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 112.0, "inconsistent_api_recovery": 199.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 21.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 128.0, "inconsistent_api_recovery_stateful": 216.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioSpeedSum": {"relevance_detection": 37.32, "argument_fidelity": 91.44, "tool_selection": 104.36, "basic_2step": 48.13, "sequential_3step": 104.42, "conditional_routing": 469.35, "sequential_reasoning": 188.48, "error_recovery": 220.52, "data_gap_recovery": 526.76, "data_gap_recovery_extended": 483.58, "argument_transformation": 1205.41, "grounded_synthesis": 923.38, "inconsistent_api_recovery": 1381.85, "relevance_detection_stateful": 32.12, "argument_fidelity_stateful": 96.07, "tool_selection_stateful": 112.51, "basic_2step_stateful": 52.34, "sequential_3step_stateful": 106.82, "conditional_routing_stateful": 476.33, "sequential_reasoning_stateful": 188.13, "error_recovery_stateful": 185.23, "data_gap_recovery_stateful": 527.65, "data_gap_recovery_extended_stateful": 461.82, "argument_transformation_stateful": 1196.54, "grounded_synthesis_stateful": 917.62, "inconsistent_api_recovery_stateful": 1420.65}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged:full]", "model": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "mistral-small-3.2", 
"quant": "q4_K_M", "gen": 2, "retired": false, "score": 71.0, "accuracy": 71.2, "completeness": 99.8, "efficiency": 95.6, "wasted": 0.5, "speed": 6.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 58, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 68, "data_gap_recovery_extended": 4, "argument_transformation": 12, "grounded_synthesis": 20, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 100}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 29, "sequential_reasoning": 
50, "error_recovery": 50, "data_gap_recovery": 34, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 10, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 50}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 116, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 170, "data_gap_recovery_extended": 16, "argument_transformation": 30, "grounded_synthesis": 100, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 400}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 116, "sequential_reasoning": 257, "error_recovery": 153, "data_gap_recovery": 273, "data_gap_recovery_extended": 7, "argument_transformation": 21, "grounded_synthesis": 62, "inconsistent_api_recovery": 315, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 99, "sequential_reasoning_stateful": 252, "error_recovery_stateful": 152, "data_gap_recovery_stateful": 21, "data_gap_recovery_extended_stateful": 62, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 51, "inconsistent_api_recovery_stateful": 352}, 
"scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 57.0, "error_recovery": 53.0, "data_gap_recovery": 141.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 5.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 52.0, "error_recovery_stateful": 2.0, "data_gap_recovery_stateful": 187.0, "data_gap_recovery_extended_stateful": 7.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 55.24, "argument_fidelity": 88.3, "tool_selection": 57.73, "basic_2step": 40.14, "sequential_3step": 85.65, "conditional_routing": 381.28, "sequential_reasoning": 377.74, "error_recovery": 201.02, "data_gap_recovery": 673.06, "data_gap_recovery_extended": 487.57, 
"argument_transformation": 724.09, "grounded_synthesis": 765.34, "inconsistent_api_recovery": 484.16, "relevance_detection_stateful": 60.08, "argument_fidelity_stateful": 88.18, "tool_selection_stateful": 57.76, "basic_2step_stateful": 49.02, "sequential_3step_stateful": 85.15, "conditional_routing_stateful": 356.47, "sequential_reasoning_stateful": 382.39, "error_recovery_stateful": 154.74, "data_gap_recovery_stateful": 374.2, "data_gap_recovery_extended_stateful": 546.5, "argument_transformation_stateful": 681.13, "grounded_synthesis_stateful": 734.01, "inconsistent_api_recovery_stateful": 471.27}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/P [reforged:full]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 70.5, "accuracy": 70.8, "completeness": 99.6, "efficiency": 87.8, "wasted": 0.4, "speed": 17.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 88, "sequential_reasoning": 100, "error_recovery": 58, 
"data_gap_recovery": 66, "data_gap_recovery_extended": 0, "argument_transformation": 24, "grounded_synthesis": 10, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 62, "data_gap_recovery_stateful": 66, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 76}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 44, "sequential_reasoning": 50, "error_recovery": 29, "data_gap_recovery": 33, "data_gap_recovery_extended": 0, "argument_transformation": 12, "grounded_synthesis": 5, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, 
"sequential_reasoning_stateful": 49, "error_recovery_stateful": 31, "data_gap_recovery_stateful": 33, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 38}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 48, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 48, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, 
"tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 176, "sequential_reasoning": 200, "error_recovery": 58, "data_gap_recovery": 165, "data_gap_recovery_extended": 0, "argument_transformation": 60, "grounded_synthesis": 50, "inconsistent_api_recovery": 352, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 93, "data_gap_recovery_stateful": 165, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 304}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 172, "basic_2step": 104, "sequential_3step": 150, "conditional_routing": 230, "sequential_reasoning": 200, "error_recovery": 89, "data_gap_recovery": 155, "data_gap_recovery_extended": 0, "argument_transformation": 48, "grounded_synthesis": 37, "inconsistent_api_recovery": 501, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 178, "basic_2step_stateful": 101, "sequential_3step_stateful": 150, "conditional_routing_stateful": 238, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 93, "data_gap_recovery_stateful": 153, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 464}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 0.0, "tool_selection": 22.0, "basic_2step": 4.0, "sequential_3step": 0.0, "conditional_routing": 54.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 0.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 155.0, 
"relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 28.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 179.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 48, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 82.63, "argument_fidelity": 197.39, "tool_selection": 328.91, "basic_2step": 185.86, "sequential_3step": 425.14, "conditional_routing": 950.54, "sequential_reasoning": 360.14, "error_recovery": 457.38, "data_gap_recovery": 755.93, "data_gap_recovery_extended": 1068.24, "argument_transformation": 1567.4, "grounded_synthesis": 1516.27, "inconsistent_api_recovery": 3213.61, "relevance_detection_stateful": 77.96, "argument_fidelity_stateful": 191.87, "tool_selection_stateful": 370.72, "basic_2step_stateful": 246.72, "sequential_3step_stateful": 453.93, "conditional_routing_stateful": 938.9, "sequential_reasoning_stateful": 355.23, 
"error_recovery_stateful": 430.11, "data_gap_recovery_stateful": 743.79, "data_gap_recovery_extended_stateful": 1119.39, "argument_transformation_stateful": 1469.35, "grounded_synthesis_stateful": 1640.27, "inconsistent_api_recovery_stateful": 3418.15}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 48, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/P [reforged]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 71.1, "accuracy": 71.2, "completeness": 99.8, "efficiency": 86.8, "wasted": 0.5, "speed": 18.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 70, "data_gap_recovery": 66, "data_gap_recovery_extended": 0, "argument_transformation": 18, "grounded_synthesis": 8, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 96, 
"sequential_reasoning_stateful": 100, "error_recovery_stateful": 60, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 70}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 35, "data_gap_recovery": 33, "data_gap_recovery_extended": 0, "argument_transformation": 9, "grounded_synthesis": 4, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 30, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 35}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 70, "data_gap_recovery": 165, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 40, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 90, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 280}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 168, "basic_2step": 109, "sequential_3step": 150, "conditional_routing": 243, "sequential_reasoning": 200, "error_recovery": 105, "data_gap_recovery": 159, "data_gap_recovery_extended": 0, "argument_transformation": 36, "grounded_synthesis": 31, "inconsistent_api_recovery": 568, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 173, "basic_2step_stateful": 102, "sequential_3step_stateful": 147, "conditional_routing_stateful": 253, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 90, "data_gap_recovery_stateful": 144, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 427}, "scenarioWastedSum": {"relevance_detection": 1.0, "argument_fidelity": 0.0, "tool_selection": 18.0, "basic_2step": 9.0, "sequential_3step": 0.0, "conditional_routing": 51.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 194.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 23.0, "basic_2step_stateful": 2.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 61.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 6.0, 
"argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 181.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 77.95, "argument_fidelity": 198.67, "tool_selection": 301.39, "basic_2step": 216.08, "sequential_3step": 441.87, "conditional_routing": 936.49, "sequential_reasoning": 355.84, "error_recovery": 426.35, "data_gap_recovery": 728.17, "data_gap_recovery_extended": 1111.83, "argument_transformation": 1620.65, "grounded_synthesis": 1743.02, "inconsistent_api_recovery": 3606.84, "relevance_detection_stateful": 75.76, "argument_fidelity_stateful": 194.42, "tool_selection_stateful": 331.41, "basic_2step_stateful": 229.55, "sequential_3step_stateful": 456.22, "conditional_routing_stateful": 965.94, "sequential_reasoning_stateful": 359.72, "error_recovery_stateful": 465.18, "data_gap_recovery_stateful": 701.39, "data_gap_recovery_extended_stateful": 1075.05, "argument_transformation_stateful": 1722.64, "grounded_synthesis_stateful": 1753.65, "inconsistent_api_recovery_stateful": 3221.9}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3-14B-Q4_K_M LS/P [reforged]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 71.4, "accuracy": 71.4, "completeness": 99.9, "efficiency": 86.3, "wasted": 0.5, "speed": 25.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 70, "data_gap_recovery": 70, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 30, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 76, "data_gap_recovery_stateful": 56, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 62}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 35, "data_gap_recovery": 35, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 15, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 38, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 31}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 70, "data_gap_recovery": 175, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 150, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 114, "data_gap_recovery_stateful": 140, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 110, 
"inconsistent_api_recovery_stateful": 248}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 190, "basic_2step": 105, "sequential_3step": 152, "conditional_routing": 264, "sequential_reasoning": 200, "error_recovery": 106, "data_gap_recovery": 166, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 205, "inconsistent_api_recovery": 406, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 194, "basic_2step_stateful": 106, "sequential_3step_stateful": 151, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 114, "data_gap_recovery_stateful": 137, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 139, "inconsistent_api_recovery_stateful": 347}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 40.0, "basic_2step": 5.0, "sequential_3step": 2.0, "conditional_routing": 68.0, "sequential_reasoning": 0.0, "error_recovery": 52.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 2.0, "argument_transformation": 0.0, "grounded_synthesis": 87.0, "inconsistent_api_recovery": 95.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 44.0, "basic_2step_stateful": 6.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 67.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 83.0, "inconsistent_api_recovery_stateful": 127.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 81.05, "argument_fidelity": 324.55, "tool_selection": 530.61, "basic_2step": 167.7, "sequential_3step": 512.13, "conditional_routing": 1316.5, "sequential_reasoning": 501.11, "error_recovery": 579.99, "data_gap_recovery": 865.69, "data_gap_recovery_extended": 1420.72, "argument_transformation": 2533.53, "grounded_synthesis": 4093.34, "inconsistent_api_recovery": 3618.01, "relevance_detection_stateful": 81.35, "argument_fidelity_stateful": 325.97, "tool_selection_stateful": 528.97, "basic_2step_stateful": 176.84, "sequential_3step_stateful": 529.7, "conditional_routing_stateful": 1322.47, "sequential_reasoning_stateful": 495.61, "error_recovery_stateful": 549.94, "data_gap_recovery_stateful": 880.14, "data_gap_recovery_extended_stateful": 1332.68, "argument_transformation_stateful": 2484.41, "grounded_synthesis_stateful": 4049.85, "inconsistent_api_recovery_stateful": 3894.2}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "ministral-3:8b-instruct-2512-q8_0 OL/N [reforged:full]", "model": "ministral-3:8b-instruct-2512-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 70.7, "accuracy": 74.5, "completeness": 94.9, "efficiency": 73.6, "wasted": 1.1, "speed": 5.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 92, "data_gap_recovery_extended": 12, "argument_transformation": 42, "grounded_synthesis": 0, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 12}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 46, "data_gap_recovery_extended": 6, "argument_transformation": 21, "grounded_synthesis": 0, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 13, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 15}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 15}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 230, "data_gap_recovery_extended": 48, "argument_transformation": 105, "grounded_synthesis": 0, "inconsistent_api_recovery": 168, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 65, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 48}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 200, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 164, "conditional_routing": 260, "sequential_reasoning": 250, "error_recovery": 150, "data_gap_recovery": 347, "data_gap_recovery_extended": 50, 
"argument_transformation": 142, "grounded_synthesis": 0, "inconsistent_api_recovery": 296, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 200, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 158, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 114, "data_gap_recovery_extended_stateful": 17, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 87}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 50.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 14.0, "conditional_routing": 60.0, "sequential_reasoning": 50.0, "error_recovery": 50.0, "data_gap_recovery": 129.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 98.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 128.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 8.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 134.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 84.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 82.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 15}, "scenarioSpeedSum": {"relevance_detection": 51.13, "argument_fidelity": 111.05, "tool_selection": 87.09, "basic_2step": 53.1, "sequential_3step": 74.19, "conditional_routing": 224.51, "sequential_reasoning": 259.91, "error_recovery": 115.73, "data_gap_recovery": 657.06, "data_gap_recovery_extended": 498.6, "argument_transformation": 727.93, "grounded_synthesis": 566.4, "inconsistent_api_recovery": 386.35, "relevance_detection_stateful": 51.32, "argument_fidelity_stateful": 110.79, "tool_selection_stateful": 86.38, "basic_2step_stateful": 47.76, "sequential_3step_stateful": 62.47, "conditional_routing_stateful": 196.98, "sequential_reasoning_stateful": 259.06, "error_recovery_stateful": 89.73, "data_gap_recovery_stateful": 533.38, "data_gap_recovery_extended_stateful": 489.69, "argument_transformation_stateful": 681.24, "grounded_synthesis_stateful": 579.42, "inconsistent_api_recovery_stateful": 269.45}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 21, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 
15}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged:full]", "model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 71.3, "accuracy": 81.0, "completeness": 88.0, "efficiency": 72.3, "wasted": 1.5, "speed": 21.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 66, "sequential_reasoning": 98, "error_recovery": 52, "data_gap_recovery": 92, "data_gap_recovery_extended": 28, "argument_transformation": 4, "grounded_synthesis": 34, "inconsistent_api_recovery": 34, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 98, "sequential_3step_stateful": 100, "conditional_routing_stateful": 86, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 68, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 38}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 33, "sequential_reasoning": 49, "error_recovery": 26, "data_gap_recovery": 46, "data_gap_recovery_extended": 14, "argument_transformation": 2, "grounded_synthesis": 17, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 34, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 19}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, "grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, 
"grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 132, "sequential_reasoning": 196, "error_recovery": 52, "data_gap_recovery": 230, "data_gap_recovery_extended": 112, "argument_transformation": 10, "grounded_synthesis": 170, "inconsistent_api_recovery": 136, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, "sequential_3step_stateful": 150, "conditional_routing_stateful": 172, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 102, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 96, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 152}, "scenarioActualCalls": {"relevance_detection": 92, "argument_fidelity": 151, "tool_selection": 160, "basic_2step": 127, "sequential_3step": 157, "conditional_routing": 174, "sequential_reasoning": 302, "error_recovery": 84, "data_gap_recovery": 325, "data_gap_recovery_extended": 126, "argument_transformation": 22, "grounded_synthesis": 276, "inconsistent_api_recovery": 277, "relevance_detection_stateful": 85, "argument_fidelity_stateful": 150, "tool_selection_stateful": 155, "basic_2step_stateful": 124, "sequential_3step_stateful": 153, "conditional_routing_stateful": 233, "sequential_reasoning_stateful": 283, 
"error_recovery_stateful": 113, "data_gap_recovery_stateful": 345, "data_gap_recovery_extended_stateful": 117, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 292, "inconsistent_api_recovery_stateful": 300}, "scenarioWastedSum": {"relevance_detection": 42.0, "argument_fidelity": 1.0, "tool_selection": 10.0, "basic_2step": 27.0, "sequential_3step": 7.0, "conditional_routing": 47.0, "sequential_reasoning": 106.0, "error_recovery": 44.0, "data_gap_recovery": 95.0, "data_gap_recovery_extended": 79.0, "argument_transformation": 100.0, "grounded_synthesis": 196.0, "inconsistent_api_recovery": 141.0, "relevance_detection_stateful": 35.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 5.0, "basic_2step_stateful": 26.0, "sequential_3step_stateful": 3.0, "conditional_routing_stateful": 61.0, "sequential_reasoning_stateful": 102.0, "error_recovery_stateful": 16.0, "data_gap_recovery_stateful": 100.0, "data_gap_recovery_extended_stateful": 76.0, "argument_transformation_stateful": 106.0, "grounded_synthesis_stateful": 190.0, "inconsistent_api_recovery_stateful": 155.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, "grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}, "scenarioSpeedSum": {"relevance_detection": 126.82, 
"argument_fidelity": 178.91, "tool_selection": 228.93, "basic_2step": 245.29, "sequential_3step": 244.92, "conditional_routing": 560.68, "sequential_reasoning": 1373.58, "error_recovery": 422.83, "data_gap_recovery": 1448.98, "data_gap_recovery_extended": 1628.44, "argument_transformation": 2589.83, "grounded_synthesis": 1477.52, "inconsistent_api_recovery": 1578.41, "relevance_detection_stateful": 124.08, "argument_fidelity_stateful": 181.64, "tool_selection_stateful": 207.7, "basic_2step_stateful": 191.81, "sequential_3step_stateful": 221.1, "conditional_routing_stateful": 589.65, "sequential_reasoning_stateful": 1295.08, "error_recovery_stateful": 568.14, "data_gap_recovery_stateful": 1337.71, "data_gap_recovery_extended_stateful": 1842.25, "argument_transformation_stateful": 2618.18, "grounded_synthesis_stateful": 1327.51, "inconsistent_api_recovery_stateful": 1893.77}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 36, "data_gap_recovery": 46, "data_gap_recovery_extended": 50, "argument_transformation": 19, "grounded_synthesis": 49, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 20}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged:full]", "model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 70.2, 
"accuracy": 70.7, "completeness": 99.4, "efficiency": 88.7, "wasted": 0.4, "speed": 10.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 52, "sequential_reasoning": 100, "error_recovery": 84, "data_gap_recovery": 90, "data_gap_recovery_extended": 6, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 80, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 66}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 26, "sequential_reasoning": 50, "error_recovery": 42, "data_gap_recovery": 45, 
"data_gap_recovery_extended": 3, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 33}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 104, "sequential_reasoning": 200, "error_recovery": 84, "data_gap_recovery": 225, "data_gap_recovery_extended": 24, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 264}, "scenarioActualCalls": {"relevance_detection": 51, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 101, "sequential_3step": 149, "conditional_routing": 141, "sequential_reasoning": 200, "error_recovery": 126, "data_gap_recovery": 225, "data_gap_recovery_extended": 23, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 566, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 151, "basic_2step_stateful": 108, "sequential_3step_stateful": 155, "conditional_routing_stateful": 117, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 379}, "scenarioWastedSum": {"relevance_detection": 1.0, 
"argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 1.0, "sequential_3step": 2.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 166.0, "relevance_detection_stateful": 1.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 8.0, "sequential_3step_stateful": 5.0, "conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 7.0, "inconsistent_api_recovery_stateful": 152.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 70.04, "argument_fidelity": 132.41, "tool_selection": 117.18, "basic_2step": 269.63, "sequential_3step": 199.88, "conditional_routing": 470.04, "sequential_reasoning": 278.36, "error_recovery": 270.9, "data_gap_recovery": 588.34, "data_gap_recovery_extended": 910.47, "argument_transformation": 886.85, 
"grounded_synthesis": 650.18, "inconsistent_api_recovery": 2156.16, "relevance_detection_stateful": 68.15, "argument_fidelity_stateful": 132.9, "tool_selection_stateful": 119.51, "basic_2step_stateful": 365.62, "sequential_3step_stateful": 210.28, "conditional_routing_stateful": 437.49, "sequential_reasoning_stateful": 306.75, "error_recovery_stateful": 290.9, "data_gap_recovery_stateful": 630.65, "data_gap_recovery_extended_stateful": 742.67, "argument_transformation_stateful": 830.77, "grounded_synthesis_stateful": 658.65, "inconsistent_api_recovery_stateful": 2145.56}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/N [reforged]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 68.5, "accuracy": 68.5, "completeness": 100.0, "efficiency": 100.0, "wasted": 0.4, "speed": 21.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 56, "data_gap_recovery": 72, 
"data_gap_recovery_extended": 14, "argument_transformation": 4, "grounded_synthesis": 58, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 72, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 52, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 28, "data_gap_recovery": 36, "data_gap_recovery_extended": 7, "argument_transformation": 2, "grounded_synthesis": 29, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 
50, "error_recovery_stateful": 21, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, 
"sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 56, "data_gap_recovery": 180, "data_gap_recovery_extended": 56, "argument_transformation": 10, "grounded_synthesis": 290, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 63, "data_gap_recovery_stateful": 180, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 260, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 197, "basic_2step": 167, "sequential_3step": 175, "conditional_routing": 194, "sequential_reasoning": 200, "error_recovery": 115, "data_gap_recovery": 140, "data_gap_recovery_extended": 28, "argument_transformation": 9, "grounded_synthesis": 144, "inconsistent_api_recovery": 17, "relevance_detection_stateful": 51, "argument_fidelity_stateful": 150, "tool_selection_stateful": 201, "basic_2step_stateful": 154, "sequential_3step_stateful": 179, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 86, "data_gap_recovery_stateful": 136, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 134, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 47.0, "basic_2step": 67.0, "sequential_3step": 29.0, "conditional_routing": 27.0, "sequential_reasoning": 0.0, "error_recovery": 108.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 3.0, "relevance_detection_stateful": 1.0, 
"argument_fidelity_stateful": 0.0, "tool_selection_stateful": 51.0, "basic_2step_stateful": 54.0, "sequential_3step_stateful": 29.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 57.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 90.79, "argument_fidelity": 336.4, "tool_selection": 547.61, "basic_2step": 496.72, "sequential_3step": 877.64, "conditional_routing": 1045.25, "sequential_reasoning": 887.36, "error_recovery": 711.78, "data_gap_recovery": 764.43, "data_gap_recovery_extended": 1199.38, "argument_transformation": 3362.58, "grounded_synthesis": 2349.54, "inconsistent_api_recovery": 1535.04, "relevance_detection_stateful": 98.26, "argument_fidelity_stateful": 331.44, "tool_selection_stateful": 548.71, "basic_2step_stateful": 445.59, "sequential_3step_stateful": 856.57, "conditional_routing_stateful": 996.05, "sequential_reasoning_stateful": 843.14, "error_recovery_stateful": 781.02, 
"data_gap_recovery_stateful": 724.93, "data_gap_recovery_extended_stateful": 1128.35, "argument_transformation_stateful": 3621.42, "grounded_synthesis_stateful": 2334.88, "inconsistent_api_recovery_stateful": 1385.84}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/N [reforged:full]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 69.3, "accuracy": 69.6, "completeness": 99.6, "efficiency": 88.3, "wasted": 0.6, "speed": 24.7, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 48, "data_gap_recovery": 76, "data_gap_recovery_extended": 2, "argument_transformation": 28, "grounded_synthesis": 24, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, 
"error_recovery_stateful": 46, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 40}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 24, "data_gap_recovery": 38, "data_gap_recovery_extended": 1, "argument_transformation": 14, "grounded_synthesis": 12, "inconsistent_api_recovery": 23, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 23, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 20}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 
50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 48, "data_gap_recovery": 190, "data_gap_recovery_extended": 8, "argument_transformation": 70, "grounded_synthesis": 120, "inconsistent_api_recovery": 184, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 69, "data_gap_recovery_stateful": 190, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 160}, "scenarioActualCalls": {"relevance_detection": 54, "argument_fidelity": 168, "tool_selection": 201, "basic_2step": 109, "sequential_3step": 159, "conditional_routing": 237, "sequential_reasoning": 213, "error_recovery": 88, "data_gap_recovery": 163, "data_gap_recovery_extended": 3, "argument_transformation": 100, "grounded_synthesis": 62, "inconsistent_api_recovery": 257, "relevance_detection_stateful": 58, "argument_fidelity_stateful": 171, "tool_selection_stateful": 201, "basic_2step_stateful": 107, "sequential_3step_stateful": 154, "conditional_routing_stateful": 245, "sequential_reasoning_stateful": 212, "error_recovery_stateful": 85, "data_gap_recovery_stateful": 187, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 227}, "scenarioWastedSum": {"relevance_detection": 5.0, "argument_fidelity": 18.0, "tool_selection": 51.0, "basic_2step": 9.0, "sequential_3step": 9.0, "conditional_routing": 49.0, "sequential_reasoning": 13.0, "error_recovery": 91.0, "data_gap_recovery": 13.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 63.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 79.0, "relevance_detection_stateful": 8.0, "argument_fidelity_stateful": 21.0, "tool_selection_stateful": 51.0, "basic_2step_stateful": 7.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 51.0, "sequential_reasoning_stateful": 12.0, "error_recovery_stateful": 44.0, "data_gap_recovery_stateful": 25.0, "data_gap_recovery_extended_stateful": 7.0, "argument_transformation_stateful": 69.0, 
"grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 87.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 189.83, "argument_fidelity": 561.07, "tool_selection": 490.97, "basic_2step": 248.43, "sequential_3step": 881.86, "conditional_routing": 1399.31, "sequential_reasoning": 839.1, "error_recovery": 1175.36, "data_gap_recovery": 1059.82, "data_gap_recovery_extended": 1381.18, "argument_transformation": 2737.43, "grounded_synthesis": 2141.66, "inconsistent_api_recovery": 2900.74, "relevance_detection_stateful": 197.11, "argument_fidelity_stateful": 535.84, "tool_selection_stateful": 503.44, "basic_2step_stateful": 301.59, "sequential_3step_stateful": 835.16, "conditional_routing_stateful": 1400.0, "sequential_reasoning_stateful": 851.94, "error_recovery_stateful": 1016.4, "data_gap_recovery_stateful": 1117.27, "data_gap_recovery_extended_stateful": 1443.31, "argument_transformation_stateful": 2715.65, "grounded_synthesis_stateful": 2136.48, "inconsistent_api_recovery_stateful": 2982.01}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite4.1:8b-q8_0 OL/N [reforged:full]", "model": "granite4.1:8b-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 69.2, "accuracy": 69.2, "completeness": 100.0, "efficiency": 83.3, "wasted": 1.1, "speed": 2.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 
50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 
50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, 
"scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 250, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 250, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 150.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 108.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 100.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 108.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 54.2, "argument_fidelity": 73.33, "tool_selection": 87.19, "basic_2step": 47.03, "sequential_3step": 68.13, "conditional_routing": 160.63, "sequential_reasoning": 80.65, "error_recovery": 157.08, "data_gap_recovery": 219.66, "data_gap_recovery_extended": 206.67, "argument_transformation": 181.51, "grounded_synthesis": 311.91, "inconsistent_api_recovery": 264.9, "relevance_detection_stateful": 54.2, "argument_fidelity_stateful": 73.21, "tool_selection_stateful": 87.06, "basic_2step_stateful": 43.04, "sequential_3step_stateful": 68.12, "conditional_routing_stateful": 168.26, "sequential_reasoning_stateful": 80.49, "error_recovery_stateful": 157.14, "data_gap_recovery_stateful": 219.53, "data_gap_recovery_extended_stateful": 206.58, "argument_transformation_stateful": 181.39, "grounded_synthesis_stateful": 311.75, "inconsistent_api_recovery_stateful": 264.85}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-27B-Q4_K_M LS/P [bare:full]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 69.0, "accuracy": 75.5, "completeness": 91.4, "efficiency": 100.0, "wasted": 0.2, "speed": 50.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 54, "grounded_synthesis": 46, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 92, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 70, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 27, "grounded_synthesis": 23, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 135, "grounded_synthesis": 230, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 138, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 175, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 179, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 244, "data_gap_recovery_extended": 0, "argument_transformation": 109, "grounded_synthesis": 263, "inconsistent_api_recovery": 257, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 140, "conditional_routing_stateful": 9, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 237, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 145, "grounded_synthesis_stateful": 185, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 1.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 91.0, "inconsistent_api_recovery": 15.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 3.0, "grounded_synthesis_stateful": 73.0, "inconsistent_api_recovery_stateful": 16.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 460.81, "argument_fidelity": 557.39, "tool_selection": 426.26, "basic_2step": 516.92, "sequential_3step": 805.49, "conditional_routing": 1188.14, "sequential_reasoning": 946.99, "error_recovery": 0.0, "data_gap_recovery": 2437.53, "data_gap_recovery_extended": 3008.39, "argument_transformation": 5834.95, "grounded_synthesis": 5097.73, "inconsistent_api_recovery": 7928.95, "relevance_detection_stateful": 343.5, "argument_fidelity_stateful": 545.73, "tool_selection_stateful": 435.01, "basic_2step_stateful": 849.63, "sequential_3step_stateful": 1131.85, "conditional_routing_stateful": 1242.6, "sequential_reasoning_stateful": 893.73, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2500.74, "data_gap_recovery_extended_stateful": 3252.47, "argument_transformation_stateful": 5730.32, "grounded_synthesis_stateful": 5259.62, "inconsistent_api_recovery_stateful": 8617.78}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 46, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:8b-q8_0 OL/N [reforged:full]", "model": "qwen3:8b-q8_0", 
"backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 67.5, "accuracy": 67.6, "completeness": 99.9, "efficiency": 85.1, "wasted": 0.6, "speed": 31.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 26, "data_gap_recovery": 88, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 6, "inconsistent_api_recovery": 66, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 82, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 44}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 
50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 13, "data_gap_recovery": 44, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 3, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 20, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 22}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 26, "data_gap_recovery": 220, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 30, "inconsistent_api_recovery": 264, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 60, "data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 176}, "scenarioActualCalls": {"relevance_detection": 56, "argument_fidelity": 150, "tool_selection": 202, "basic_2step": 133, "sequential_3step": 155, "conditional_routing": 252, "sequential_reasoning": 200, "error_recovery": 39, "data_gap_recovery": 229, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 26, "inconsistent_api_recovery": 379, "relevance_detection_stateful": 57, "argument_fidelity_stateful": 150, "tool_selection_stateful": 201, "basic_2step_stateful": 125, "sequential_3step_stateful": 154, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 62, "data_gap_recovery_stateful": 221, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 264}, "scenarioWastedSum": {"relevance_detection": 6.0, "argument_fidelity": 0.0, "tool_selection": 52.0, "basic_2step": 33.0, "sequential_3step": 5.0, "conditional_routing": 52.0, "sequential_reasoning": 0.0, "error_recovery": 88.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 3.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 170.0, "relevance_detection_stateful": 7.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 51.0, "basic_2step_stateful": 25.0, "sequential_3step_stateful": 4.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 33.0, "data_gap_recovery_stateful": 24.0, "data_gap_recovery_extended_stateful": 3.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 3.0, "inconsistent_api_recovery_stateful": 183.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 160.09, "argument_fidelity": 379.27, "tool_selection": 722.42, "basic_2step": 454.24, "sequential_3step": 674.39, "conditional_routing": 1498.29, "sequential_reasoning": 507.33, 
"error_recovery": 724.42, "data_gap_recovery": 1028.09, "data_gap_recovery_extended": 1690.29, "argument_transformation": 3008.72, "grounded_synthesis": 2852.96, "inconsistent_api_recovery": 6197.34, "relevance_detection_stateful": 161.17, "argument_fidelity_stateful": 420.92, "tool_selection_stateful": 749.62, "basic_2step_stateful": 491.27, "sequential_3step_stateful": 744.41, "conditional_routing_stateful": 1621.71, "sequential_reasoning_stateful": 586.99, "error_recovery_stateful": 714.69, "data_gap_recovery_stateful": 1199.1, "data_gap_recovery_extended_stateful": 1843.54, "argument_transformation_stateful": 3016.37, "grounded_synthesis_stateful": 2514.35, "inconsistent_api_recovery_stateful": 6354.36}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/N [reforged:full]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 68.4, "accuracy": 68.4, "completeness": 99.9, "efficiency": 83.4, "wasted": 0.9, "speed": 21.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 
100, "sequential_3step": 98, "conditional_routing": 90, "sequential_reasoning": 98, "error_recovery": 60, "data_gap_recovery": 32, "data_gap_recovery_extended": 20, "argument_transformation": 18, "grounded_synthesis": 50, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 86, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 74, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 18, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 22}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 45, "sequential_reasoning": 49, "error_recovery": 30, "data_gap_recovery": 16, "data_gap_recovery_extended": 10, "argument_transformation": 9, "grounded_synthesis": 25, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 11}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 180, "sequential_reasoning": 196, "error_recovery": 60, "data_gap_recovery": 80, "data_gap_recovery_extended": 80, "argument_transformation": 45, "grounded_synthesis": 250, "inconsistent_api_recovery": 152, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 172, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 111, "data_gap_recovery_stateful": 85, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 170, "inconsistent_api_recovery_stateful": 88}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 228, "tool_selection": 221, "basic_2step": 175, "sequential_3step": 180, "conditional_routing": 204, "sequential_reasoning": 236, "error_recovery": 130, "data_gap_recovery": 60, "data_gap_recovery_extended": 38, "argument_transformation": 58, "grounded_synthesis": 171, "inconsistent_api_recovery": 224, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 229, "tool_selection_stateful": 218, "basic_2step_stateful": 149, "sequential_3step_stateful": 185, "conditional_routing_stateful": 176, "sequential_reasoning_stateful": 251, "error_recovery_stateful": 158, "data_gap_recovery_stateful": 66, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 58, "grounded_synthesis_stateful": 97, "inconsistent_api_recovery_stateful": 133}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 78.0, "tool_selection": 71.0, "basic_2step": 75.0, "sequential_3step": 34.0, "conditional_routing": 57.0, "sequential_reasoning": 40.0, "error_recovery": 123.0, "data_gap_recovery": 9.0, 
"data_gap_recovery_extended": 2.0, "argument_transformation": 63.0, "grounded_synthesis": 42.0, "inconsistent_api_recovery": 73.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 79.0, "tool_selection_stateful": 68.0, "basic_2step_stateful": 49.0, "sequential_3step_stateful": 35.0, "conditional_routing_stateful": 52.0, "sequential_reasoning_stateful": 51.0, "error_recovery_stateful": 66.0, "data_gap_recovery_stateful": 8.0, "data_gap_recovery_extended_stateful": 4.0, "argument_transformation_stateful": 90.0, "grounded_synthesis_stateful": 15.0, "inconsistent_api_recovery_stateful": 48.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 97.96, "argument_fidelity": 610.75, "tool_selection": 557.07, "basic_2step": 393.42, "sequential_3step": 762.25, "conditional_routing": 1140.05, "sequential_reasoning": 957.0, "error_recovery": 650.29, "data_gap_recovery": 839.83, "data_gap_recovery_extended": 1081.33, "argument_transformation": 2870.04, "grounded_synthesis": 2169.78, "inconsistent_api_recovery": 2339.98, "relevance_detection_stateful": 98.58, "argument_fidelity_stateful": 614.41, "tool_selection_stateful": 530.26, "basic_2step_stateful": 
316.49, "sequential_3step_stateful": 759.06, "conditional_routing_stateful": 1105.83, "sequential_reasoning_stateful": 984.61, "error_recovery_stateful": 613.02, "data_gap_recovery_stateful": 815.52, "data_gap_recovery_extended_stateful": 1097.35, "argument_transformation_stateful": 2980.64, "grounded_synthesis_stateful": 2088.97, "inconsistent_api_recovery_stateful": 2014.76}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/N [reforged]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 68.2, "accuracy": 68.5, "completeness": 99.5, "efficiency": 95.1, "wasted": 0.3, "speed": 24.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 56, "data_gap_recovery": 78, "data_gap_recovery_extended": 6, "argument_transformation": 24, "grounded_synthesis": 28, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, 
"tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 52, "data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 28, "data_gap_recovery": 39, "data_gap_recovery_extended": 3, "argument_transformation": 12, "grounded_synthesis": 14, "inconsistent_api_recovery": 3, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 26, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 15, 
"inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 56, "data_gap_recovery": 195, "data_gap_recovery_extended": 24, 
"argument_transformation": 60, "grounded_synthesis": 140, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 78, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 54, "argument_fidelity": 150, "tool_selection": 201, "basic_2step": 118, "sequential_3step": 156, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 105, "data_gap_recovery": 182, "data_gap_recovery_extended": 12, "argument_transformation": 47, "grounded_synthesis": 74, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 57, "argument_fidelity_stateful": 150, "tool_selection_stateful": 197, "basic_2step_stateful": 135, "sequential_3step_stateful": 157, "conditional_routing_stateful": 251, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 95, "data_gap_recovery_stateful": 203, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 81, "inconsistent_api_recovery_stateful": 12}, "scenarioWastedSum": {"relevance_detection": 4.0, "argument_fidelity": 0.0, "tool_selection": 51.0, "basic_2step": 18.0, "sequential_3step": 6.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 88.0, "data_gap_recovery": 8.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 4.0, "relevance_detection_stateful": 8.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 47.0, "basic_2step_stateful": 35.0, "sequential_3step_stateful": 7.0, "conditional_routing_stateful": 51.0, 
"sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 43.0, "data_gap_recovery_stateful": 14.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 5.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 187.41, "argument_fidelity": 458.95, "tool_selection": 723.65, "basic_2step": 327.85, "sequential_3step": 887.15, "conditional_routing": 1486.27, "sequential_reasoning": 732.14, "error_recovery": 1164.44, "data_gap_recovery": 1124.85, "data_gap_recovery_extended": 1337.8, "argument_transformation": 3190.43, "grounded_synthesis": 2205.5, "inconsistent_api_recovery": 2221.72, "relevance_detection_stateful": 193.17, "argument_fidelity_stateful": 474.87, "tool_selection_stateful": 639.59, "basic_2step_stateful": 417.13, "sequential_3step_stateful": 794.98, "conditional_routing_stateful": 1501.06, "sequential_reasoning_stateful": 642.73, "error_recovery_stateful": 1187.63, "data_gap_recovery_stateful": 1254.88, "data_gap_recovery_extended_stateful": 1391.18, "argument_transformation_stateful": 2995.89, "grounded_synthesis_stateful": 2279.8, 
"inconsistent_api_recovery_stateful": 2181.14}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 47, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/P [reforged:full]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 68.2, "accuracy": 75.6, "completeness": 90.2, "efficiency": 93.9, "wasted": 1.0, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 100, "sequential_reasoning": 64, "error_recovery": 100, "data_gap_recovery": 90, "data_gap_recovery_extended": 80, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 60, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 70, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 32, "error_recovery": 50, "data_gap_recovery": 45, "data_gap_recovery_extended": 40, "argument_transformation": 0, "grounded_synthesis": 25, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 200, "sequential_reasoning": 128, "error_recovery": 100, "data_gap_recovery": 225, "data_gap_recovery_extended": 320, "argument_transformation": 0, "grounded_synthesis": 250, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 200, 
"sequential_reasoning_stateful": 120, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 240, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 158, "tool_selection": 171, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 250, "sequential_reasoning": 351, "error_recovery": 150, "data_gap_recovery": 194, "data_gap_recovery_extended": 140, "argument_transformation": 0, "grounded_synthesis": 227, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 157, "tool_selection_stateful": 168, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 248, "sequential_reasoning_stateful": 320, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 187, "data_gap_recovery_extended_stateful": 128, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 238, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 8.0, "tool_selection": 21.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 241.0, "error_recovery": 50.0, "data_gap_recovery": 21.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 7.0, "grounded_synthesis": 112.0, "inconsistent_api_recovery": 97.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 18.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 48.0, "sequential_reasoning_stateful": 243.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 15.0, "grounded_synthesis_stateful": 113.0, "inconsistent_api_recovery_stateful": 86.0}, "scenarioWastedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 18.23, "argument_fidelity": 48.16, "tool_selection": 65.1, "basic_2step": 30.51, "sequential_3step": 0.0, "conditional_routing": 128.95, "sequential_reasoning": 124.57, "error_recovery": 34.1, "data_gap_recovery": 155.55, "data_gap_recovery_extended": 202.16, "argument_transformation": 205.06, "grounded_synthesis": 491.91, "inconsistent_api_recovery": 385.08, "relevance_detection_stateful": 18.95, "argument_fidelity_stateful": 47.74, "tool_selection_stateful": 65.76, "basic_2step_stateful": 33.48, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 128.08, "sequential_reasoning_stateful": 127.05, "error_recovery_stateful": 34.09, "data_gap_recovery_stateful": 143.22, "data_gap_recovery_extended_stateful": 194.04, "argument_transformation_stateful": 213.1, "grounded_synthesis_stateful": 509.16, "inconsistent_api_recovery_stateful": 377.19}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 36, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 38, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [bare:full]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 67.9, "accuracy": 77.5, "completeness": 87.7, "efficiency": 100.0, "wasted": 0.2, "speed": 9.5, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 98, "tool_selection": 78, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 84, "error_recovery": 0, "data_gap_recovery": 92, "data_gap_recovery_extended": 2, "argument_transformation": 24, "grounded_synthesis": 80, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 78, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 84, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 49, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 42, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 1, "argument_transformation": 12, "grounded_synthesis": 40, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 48, "argument_fidelity": 49, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 43, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 48, "argument_fidelity": 49, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 43, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 147, "tool_selection": 117, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 168, "error_recovery": 0, "data_gap_recovery": 230, "data_gap_recovery_extended": 8, "argument_transformation": 60, "grounded_synthesis": 400, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 117, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 420, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 121, "basic_2step": 102, "sequential_3step": 
157, "conditional_routing": 203, "sequential_reasoning": 175, "error_recovery": 0, "data_gap_recovery": 214, "data_gap_recovery_extended": 6, "argument_transformation": 47, "grounded_synthesis": 200, "inconsistent_api_recovery": 290, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 155, "tool_selection_stateful": 124, "basic_2step_stateful": 101, "sequential_3step_stateful": 159, "conditional_routing_stateful": 208, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 223, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 225, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 3.0, "tool_selection": 4.0, "basic_2step": 2.0, "sequential_3step": 7.0, "conditional_routing": 23.0, "sequential_reasoning": 8.0, "error_recovery": 0.0, "data_gap_recovery": 15.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 52.0, "inconsistent_api_recovery": 13.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 5.0, "tool_selection_stateful": 7.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 9.0, "conditional_routing_stateful": 29.0, "sequential_reasoning_stateful": 8.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 19.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 44.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 48, "argument_fidelity": 49, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 43, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 80.15, "argument_fidelity": 165.13, "tool_selection": 109.3, "basic_2step": 68.67, "sequential_3step": 155.72, "conditional_routing": 482.87, "sequential_reasoning": 288.73, "error_recovery": 0.0, "data_gap_recovery": 654.15, "data_gap_recovery_extended": 651.05, "argument_transformation": 887.48, "grounded_synthesis": 761.02, "inconsistent_api_recovery": 1215.56, "relevance_detection_stateful": 75.13, "argument_fidelity_stateful": 140.43, "tool_selection_stateful": 112.43, "basic_2step_stateful": 70.82, "sequential_3step_stateful": 162.5, "conditional_routing_stateful": 480.09, "sequential_reasoning_stateful": 278.78, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 625.1, "data_gap_recovery_extended_stateful": 619.73, "argument_transformation_stateful": 814.89, "grounded_synthesis_stateful": 755.11, "inconsistent_api_recovery_stateful": 1136.81}, "scenarioSpeedN": {"relevance_detection": 48, "argument_fidelity": 49, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 43, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, 
"data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3-8B-Q4_K_M LS/N [reforged]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 67.3, "accuracy": 67.5, "completeness": 99.7, "efficiency": 95.6, "wasted": 0.3, "speed": 15.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 40, "data_gap_recovery": 98, "data_gap_recovery_extended": 6, "argument_transformation": 14, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 86, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 6}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 20, "data_gap_recovery": 49, "data_gap_recovery_extended": 3, "argument_transformation": 7, "grounded_synthesis": 11, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 24, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 3}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 40, "data_gap_recovery": 245, "data_gap_recovery_extended": 24, "argument_transformation": 35, "grounded_synthesis": 110, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 72, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 130, "inconsistent_api_recovery_stateful": 24}, "scenarioActualCalls": {"relevance_detection": 57, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 119, "sequential_3step": 157, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 76, "data_gap_recovery": 222, "data_gap_recovery_extended": 16, "argument_transformation": 28, "grounded_synthesis": 65, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 53, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 102, 
"sequential_3step_stateful": 156, "conditional_routing_stateful": 251, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 82, "data_gap_recovery_stateful": 197, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 72, "inconsistent_api_recovery_stateful": 28}, "scenarioWastedSum": {"relevance_detection": 7.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 19.0, "sequential_3step": 7.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 94.0, "data_gap_recovery": 11.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 3.0, "relevance_detection_stateful": 3.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 2.0, "sequential_3step_stateful": 6.0, "conditional_routing_stateful": 51.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 44.0, "data_gap_recovery_stateful": 14.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 6.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 66.0, "argument_fidelity": 282.85, "tool_selection": 481.21, "basic_2step": 207.37, "sequential_3step": 446.7, "conditional_routing": 1060.62, "sequential_reasoning": 525.3, "error_recovery": 621.99, "data_gap_recovery": 817.59, "data_gap_recovery_extended": 999.09, "argument_transformation": 1888.46, "grounded_synthesis": 1478.43, "inconsistent_api_recovery": 1320.93, "relevance_detection_stateful": 60.12, "argument_fidelity_stateful": 282.39, "tool_selection_stateful": 488.12, "basic_2step_stateful": 147.11, "sequential_3step_stateful": 437.07, "conditional_routing_stateful": 1034.13, "sequential_reasoning_stateful": 549.76, "error_recovery_stateful": 752.47, "data_gap_recovery_stateful": 824.26, "data_gap_recovery_extended_stateful": 920.07, "argument_transformation_stateful": 1732.17, "grounded_synthesis_stateful": 1372.17, "inconsistent_api_recovery_stateful": 1424.73}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/N [reforged:keep-last]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "qwen3-8b", "quant": 
"q8_0", "gen": 3, "retired": false, "score": 67.0, "accuracy": 67.2, "completeness": 99.8, "efficiency": 92.4, "wasted": 0.4, "speed": 23.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 98, "error_recovery": 48, "data_gap_recovery": 84, "data_gap_recovery_extended": 2, "argument_transformation": 22, "grounded_synthesis": 10, "inconsistent_api_recovery": 12, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 52, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 18, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 6}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 49, 
"error_recovery": 24, "data_gap_recovery": 42, "data_gap_recovery_extended": 1, "argument_transformation": 11, "grounded_synthesis": 5, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 26, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 3}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 196, "error_recovery": 48, "data_gap_recovery": 210, "data_gap_recovery_extended": 8, "argument_transformation": 55, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 78, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 24}, "scenarioActualCalls": {"relevance_detection": 58, "argument_fidelity": 167, "tool_selection": 204, "basic_2step": 113, "sequential_3step": 153, "conditional_routing": 235, "sequential_reasoning": 209, "error_recovery": 88, "data_gap_recovery": 186, "data_gap_recovery_extended": 4, "argument_transformation": 60, "grounded_synthesis": 29, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 59, "argument_fidelity_stateful": 166, "tool_selection_stateful": 202, "basic_2step_stateful": 103, "sequential_3step_stateful": 158, "conditional_routing_stateful": 242, "sequential_reasoning_stateful": 207, "error_recovery_stateful": 90, "data_gap_recovery_stateful": 178, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 52, "inconsistent_api_recovery_stateful": 22}, "scenarioWastedSum": 
{"relevance_detection": 8.0, "argument_fidelity": 17.0, "tool_selection": 54.0, "basic_2step": 13.0, "sequential_3step": 3.0, "conditional_routing": 47.0, "sequential_reasoning": 13.0, "error_recovery": 91.0, "data_gap_recovery": 11.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 16.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 7.0, "relevance_detection_stateful": 9.0, "argument_fidelity_stateful": 16.0, "tool_selection_stateful": 52.0, "basic_2step_stateful": 3.0, "sequential_3step_stateful": 8.0, "conditional_routing_stateful": 51.0, "sequential_reasoning_stateful": 7.0, "error_recovery_stateful": 36.0, "data_gap_recovery_stateful": 9.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 11.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 11.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 222.38, "argument_fidelity": 540.53, "tool_selection": 571.38, "basic_2step": 273.19, "sequential_3step": 807.93, "conditional_routing": 1407.12, "sequential_reasoning": 750.0, "error_recovery": 1060.01, "data_gap_recovery": 1116.19, "data_gap_recovery_extended": 1377.5, 
"argument_transformation": 2865.67, "grounded_synthesis": 2116.07, "inconsistent_api_recovery": 2084.68, "relevance_detection_stateful": 183.42, "argument_fidelity_stateful": 529.21, "tool_selection_stateful": 515.65, "basic_2step_stateful": 293.03, "sequential_3step_stateful": 897.31, "conditional_routing_stateful": 1367.47, "sequential_reasoning_stateful": 790.31, "error_recovery_stateful": 974.95, "data_gap_recovery_stateful": 1070.56, "data_gap_recovery_extended_stateful": 1373.65, "argument_transformation_stateful": 2591.96, "grounded_synthesis_stateful": 2045.45, "inconsistent_api_recovery_stateful": 2251.16}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged:full]", "model": "ministral-3:8b-instruct-2512-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 66.8, "accuracy": 71.9, "completeness": 92.9, "efficiency": 67.7, "wasted": 1.4, "speed": 5.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 76, "conditional_routing": 100, 
"sequential_reasoning": 100, "error_recovery": 68, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 4, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 28, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 80, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 22}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 34, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 
14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 11}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 
50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 114, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 68, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 20, "inconsistent_api_recovery": 256, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 42, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 88}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 200, "tool_selection": 200, "basic_2step": 150, "sequential_3step": 118, "conditional_routing": 258, "sequential_reasoning": 250, "error_recovery": 194, "data_gap_recovery": 315, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 18, "inconsistent_api_recovery": 504, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 200, "tool_selection_stateful": 200, "basic_2step_stateful": 150, "sequential_3step_stateful": 42, "conditional_routing_stateful": 260, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 252, "data_gap_recovery_stateful": 344, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 165}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 50.0, "tool_selection": 50.0, "basic_2step": 50.0, "sequential_3step": 4.0, "conditional_routing": 58.0, "sequential_reasoning": 50.0, "error_recovery": 151.0, "data_gap_recovery": 90.0, "data_gap_recovery_extended": 33.0, "argument_transformation": 18.0, "grounded_synthesis": 2.0, 
"inconsistent_api_recovery": 248.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 60.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 137.0, "data_gap_recovery_stateful": 101.0, "data_gap_recovery_extended_stateful": 18.0, "argument_transformation_stateful": 25.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 241.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 32.52, "argument_fidelity": 79.72, "tool_selection": 64.14, "basic_2step": 37.52, "sequential_3step": 55.8, "conditional_routing": 290.91, "sequential_reasoning": 137.39, "error_recovery": 234.75, "data_gap_recovery": 356.68, "data_gap_recovery_extended": 581.33, "argument_transformation": 345.16, "grounded_synthesis": 473.91, "inconsistent_api_recovery": 447.82, "relevance_detection_stateful": 32.54, "argument_fidelity_stateful": 80.5, "tool_selection_stateful": 64.1, "basic_2step_stateful": 32.0, "sequential_3step_stateful": 16.24, "conditional_routing_stateful": 245.28, 
"sequential_reasoning_stateful": 138.62, "error_recovery_stateful": 310.54, "data_gap_recovery_stateful": 357.36, "data_gap_recovery_extended_stateful": 505.19, "argument_transformation_stateful": 372.93, "grounded_synthesis_stateful": 563.91, "inconsistent_api_recovery_stateful": 649.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 47, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 41, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [bare]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 67.5, "accuracy": 73.5, "completeness": 91.8, "efficiency": 100.0, "wasted": 0.3, "speed": 7.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 88, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 90, "data_gap_recovery_extended": 2, "argument_transformation": 28, "grounded_synthesis": 44, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, 
"conditional_routing_stateful": 82, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 62, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 44, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 1, "argument_transformation": 14, "grounded_synthesis": 22, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 176, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 225, "data_gap_recovery_extended": 8, "argument_transformation": 70, "grounded_synthesis": 220, "inconsistent_api_recovery": 280, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 310, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 210, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 218, "data_gap_recovery_extended": 5, "argument_transformation": 55, "grounded_synthesis": 186, "inconsistent_api_recovery": 150, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 201, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 197, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 339, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 36.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 11.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 8.0, "grounded_synthesis": 66.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 7.0, 
"data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 139.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 85.04, "argument_fidelity": 91.61, "tool_selection": 87.87, "basic_2step": 50.33, "sequential_3step": 96.06, "conditional_routing": 336.5, "sequential_reasoning": 167.5, "error_recovery": 0.0, "data_gap_recovery": 455.36, "data_gap_recovery_extended": 515.63, "argument_transformation": 1063.69, "grounded_synthesis": 671.54, "inconsistent_api_recovery": 842.37, "relevance_detection_stateful": 86.97, "argument_fidelity_stateful": 89.99, "tool_selection_stateful": 87.76, "basic_2step_stateful": 45.67, "sequential_3step_stateful": 78.49, "conditional_routing_stateful": 334.52, "sequential_reasoning_stateful": 164.86, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 431.11, "data_gap_recovery_extended_stateful": 531.28, "argument_transformation_stateful": 1078.1, "grounded_synthesis_stateful": 740.74, "inconsistent_api_recovery_stateful": 837.8}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [bare:keep-last]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 67.0, "accuracy": 75.4, "completeness": 88.8, "efficiency": 100.0, "wasted": 0.2, "speed": 14.0, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 100, "tool_selection": 72, "basic_2step": 98, "sequential_3step": 100, "conditional_routing": 74, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 96, "data_gap_recovery_extended": 0, "argument_transformation": 28, "grounded_synthesis": 78, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 74, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 74, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 84, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 37, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 39, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 108, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 148, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 240, "data_gap_recovery_extended": 0, "argument_transformation": 70, "grounded_synthesis": 390, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 111, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 148, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 420, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 154, "tool_selection": 109, "basic_2step": 100, "sequential_3step": 161, "conditional_routing": 155, "sequential_reasoning": 206, "error_recovery": 0, "data_gap_recovery": 217, "data_gap_recovery_extended": 0, "argument_transformation": 58, "grounded_synthesis": 187, "inconsistent_api_recovery": 396, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 152, "tool_selection_stateful": 112, "basic_2step_stateful": 100, "sequential_3step_stateful": 161, "conditional_routing_stateful": 160, "sequential_reasoning_stateful": 198, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 211, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 186, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 4.0, "tool_selection": 1.0, "basic_2step": 2.0, "sequential_3step": 11.0, "conditional_routing": 14.0, "sequential_reasoning": 15.0, "error_recovery": 0.0, "data_gap_recovery": 10.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 5.0, "grounded_synthesis": 18.0, "inconsistent_api_recovery": 70.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 1.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 14.0, "conditional_routing_stateful": 20.0, "sequential_reasoning_stateful": 14.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 13.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 17.0, "inconsistent_api_recovery_stateful": 43.0}, "scenarioWastedN": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 110.84, "argument_fidelity": 229.42, "tool_selection": 150.91, "basic_2step": 90.4, "sequential_3step": 240.04, "conditional_routing": 629.06, "sequential_reasoning": 415.21, "error_recovery": 0.0, "data_gap_recovery": 960.73, "data_gap_recovery_extended": 976.04, "argument_transformation": 1229.53, "grounded_synthesis": 1093.49, "inconsistent_api_recovery": 2091.4, "relevance_detection_stateful": 112.51, "argument_fidelity_stateful": 255.65, "tool_selection_stateful": 159.97, "basic_2step_stateful": 101.81, "sequential_3step_stateful": 258.04, "conditional_routing_stateful": 627.32, "sequential_reasoning_stateful": 418.48, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 874.37, "data_gap_recovery_extended_stateful": 1020.43, "argument_transformation_stateful": 1160.31, "grounded_synthesis_stateful": 1071.06, "inconsistent_api_recovery_stateful": 1888.78}, "scenarioSpeedN": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/N [bare:keep-last]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 66.6, "accuracy": 75.6, "completeness": 88.1, "efficiency": 100.0, "wasted": 0.2, "speed": 9.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 68, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 90, "error_recovery": 0, "data_gap_recovery": 96, "data_gap_recovery_extended": 0, "argument_transformation": 24, "grounded_synthesis": 78, "inconsistent_api_recovery": 84, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 66, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 88, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 78, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 
50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 45, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 0, "argument_transformation": 12, "grounded_synthesis": 39, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 47, "grounded_synthesis": 47, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, 
"inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 47, "grounded_synthesis": 47, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 102, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 180, "error_recovery": 0, "data_gap_recovery": 240, "data_gap_recovery_extended": 0, "argument_transformation": 60, "grounded_synthesis": 390, "inconsistent_api_recovery": 336, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 99, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 152, "sequential_reasoning_stateful": 176, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 390, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 158, "tool_selection": 104, "basic_2step": 102, "sequential_3step": 161, "conditional_routing": 177, "sequential_reasoning": 194, "error_recovery": 0, "data_gap_recovery": 230, "data_gap_recovery_extended": 0, 
"argument_transformation": 40, "grounded_synthesis": 180, "inconsistent_api_recovery": 295, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 153, "tool_selection_stateful": 101, "basic_2step_stateful": 101, "sequential_3step_stateful": 158, "conditional_routing_stateful": 155, "sequential_reasoning_stateful": 186, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 214, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 167, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 8.0, "tool_selection": 2.0, "basic_2step": 2.0, "sequential_3step": 11.0, "conditional_routing": 21.0, "sequential_reasoning": 14.0, "error_recovery": 0.0, "data_gap_recovery": 17.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 4.0, "grounded_synthesis": 17.0, "inconsistent_api_recovery": 27.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.0, "tool_selection_stateful": 2.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 8.0, "conditional_routing_stateful": 19.0, "sequential_reasoning_stateful": 10.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 9.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 33.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 47, "grounded_synthesis": 47, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 78.68, "argument_fidelity": 139.46, "tool_selection": 99.73, "basic_2step": 70.98, "sequential_3step": 167.96, "conditional_routing": 422.6, "sequential_reasoning": 312.6, "error_recovery": 0.0, "data_gap_recovery": 593.35, "data_gap_recovery_extended": 609.28, "argument_transformation": 864.24, "grounded_synthesis": 769.02, "inconsistent_api_recovery": 1253.19, "relevance_detection_stateful": 83.26, "argument_fidelity_stateful": 141.34, "tool_selection_stateful": 90.62, "basic_2step_stateful": 73.86, "sequential_3step_stateful": 160.1, "conditional_routing_stateful": 443.89, "sequential_reasoning_stateful": 275.86, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 631.07, "data_gap_recovery_extended_stateful": 619.44, "argument_transformation_stateful": 806.09, "grounded_synthesis_stateful": 689.37, "inconsistent_api_recovery_stateful": 1223.6}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 34, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 47, "grounded_synthesis": 47, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 48}}, 
{"label": "Qwen3-8B-Q4_K_M LS/N [reforged:full]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 65.8, "accuracy": 66.0, "completeness": 99.7, "efficiency": 84.0, "wasted": 0.7, "speed": 17.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 34, "data_gap_recovery": 66, "data_gap_recovery_extended": 0, "argument_transformation": 18, "grounded_synthesis": 10, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 96, "sequential_3step_stateful": 100, "conditional_routing_stateful": 86, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 34, "data_gap_recovery_stateful": 74, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 40}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 17, "data_gap_recovery": 33, "data_gap_recovery_extended": 0, "argument_transformation": 9, "grounded_synthesis": 5, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 17, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 20}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 34, "data_gap_recovery": 165, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 152, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 96, "sequential_3step_stateful": 150, "conditional_routing_stateful": 172, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 51, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 25, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 160}, "scenarioActualCalls": {"relevance_detection": 53, "argument_fidelity": 165, "tool_selection": 202, "basic_2step": 111, "sequential_3step": 173, "conditional_routing": 236, "sequential_reasoning": 207, "error_recovery": 57, "data_gap_recovery": 154, "data_gap_recovery_extended": 0, "argument_transformation": 52, "grounded_synthesis": 75, "inconsistent_api_recovery": 246, "relevance_detection_stateful": 52, "argument_fidelity_stateful": 165, "tool_selection_stateful": 205, "basic_2step_stateful": 112, "sequential_3step_stateful": 171, "conditional_routing_stateful": 216, "sequential_reasoning_stateful": 215, "error_recovery_stateful": 56, 
"data_gap_recovery_stateful": 162, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 253}, "scenarioWastedSum": {"relevance_detection": 3.0, "argument_fidelity": 15.0, "tool_selection": 52.0, "basic_2step": 11.0, "sequential_3step": 23.0, "conditional_routing": 49.0, "sequential_reasoning": 7.0, "error_recovery": 93.0, "data_gap_recovery": 17.0, "data_gap_recovery_extended": 11.0, "argument_transformation": 49.0, "grounded_synthesis": 45.0, "inconsistent_api_recovery": 97.0, "relevance_detection_stateful": 2.0, "argument_fidelity_stateful": 15.0, "tool_selection_stateful": 55.0, "basic_2step_stateful": 17.0, "sequential_3step_stateful": 21.0, "conditional_routing_stateful": 45.0, "sequential_reasoning_stateful": 15.0, "error_recovery_stateful": 38.0, "data_gap_recovery_stateful": 15.0, "data_gap_recovery_extended_stateful": 8.0, "argument_transformation_stateful": 64.0, "grounded_synthesis_stateful": 12.0, "inconsistent_api_recovery_stateful": 101.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 64.29, "argument_fidelity": 342.37, "tool_selection": 
364.62, "basic_2step": 175.02, "sequential_3step": 586.31, "conditional_routing": 972.05, "sequential_reasoning": 706.02, "error_recovery": 576.03, "data_gap_recovery": 737.89, "data_gap_recovery_extended": 1062.39, "argument_transformation": 1910.41, "grounded_synthesis": 1773.92, "inconsistent_api_recovery": 2028.02, "relevance_detection_stateful": 61.25, "argument_fidelity_stateful": 355.67, "tool_selection_stateful": 340.04, "basic_2step_stateful": 179.85, "sequential_3step_stateful": 586.5, "conditional_routing_stateful": 948.36, "sequential_reasoning_stateful": 650.45, "error_recovery_stateful": 589.53, "data_gap_recovery_stateful": 722.01, "data_gap_recovery_extended_stateful": 1007.49, "argument_transformation_stateful": 1931.41, "grounded_synthesis_stateful": 1585.43, "inconsistent_api_recovery_stateful": 2008.44}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 48, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [bare]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 66.2, "accuracy": 72.2, "completeness": 91.7, "efficiency": 100.0, "wasted": 0.2, "speed": 10.1, "n": 
50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 82, "sequential_reasoning": 94, "error_recovery": 0, "data_gap_recovery": 94, "data_gap_recovery_extended": 0, "argument_transformation": 30, "grounded_synthesis": 48, "inconsistent_api_recovery": 86, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 74, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 41, "sequential_reasoning": 47, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 24, 
"inconsistent_api_recovery": 43, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, 
"data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 164, "sequential_reasoning": 188, "error_recovery": 0, "data_gap_recovery": 235, "data_gap_recovery_extended": 0, "argument_transformation": 75, "grounded_synthesis": 240, "inconsistent_api_recovery": 344, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 148, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 130, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 194, "sequential_reasoning": 188, "error_recovery": 0, "data_gap_recovery": 219, "data_gap_recovery_extended": 0, "argument_transformation": 60, "grounded_synthesis": 196, "inconsistent_api_recovery": 184, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 178, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 126, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, 
"conditional_routing": 31.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 9.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 51.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 31.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 8.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 58.0, "inconsistent_api_recovery_stateful": 1.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 106.63, "argument_fidelity": 118.35, "tool_selection": 108.27, "basic_2step": 61.0, "sequential_3step": 130.78, "conditional_routing": 489.85, "sequential_reasoning": 205.42, "error_recovery": 0.0, "data_gap_recovery": 656.83, "data_gap_recovery_extended": 781.55, "argument_transformation": 1263.38, "grounded_synthesis": 949.86, "inconsistent_api_recovery": 1157.3, "relevance_detection_stateful": 107.9, 
"argument_fidelity_stateful": 112.19, "tool_selection_stateful": 106.56, "basic_2step_stateful": 61.34, "sequential_3step_stateful": 125.8, "conditional_routing_stateful": 478.2, "sequential_reasoning_stateful": 216.31, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 656.66, "data_gap_recovery_extended_stateful": 767.44, "argument_transformation_stateful": 1274.78, "grounded_synthesis_stateful": 909.95, "inconsistent_api_recovery_stateful": 1244.53}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/N [bare:full]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 65.5, "accuracy": 75.3, "completeness": 87.0, "efficiency": 100.0, "wasted": 0.3, "speed": 14.4, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 96, "tool_selection": 76, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 88, "sequential_reasoning": 86, "error_recovery": 0, "data_gap_recovery": 80, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 84, "inconsistent_api_recovery": 80, 
"relevance_detection_stateful": 92, "argument_fidelity_stateful": 100, "tool_selection_stateful": 74, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 88, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 86, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 48, "tool_selection": 38, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 44, "sequential_reasoning": 43, "error_recovery": 0, "data_gap_recovery": 40, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 42, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 48, "argument_fidelity": 48, "tool_selection": 38, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 48, "argument_fidelity": 48, "tool_selection": 38, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 144, "tool_selection": 114, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 176, "sequential_reasoning": 172, "error_recovery": 0, 
"data_gap_recovery": 200, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 420, "inconsistent_api_recovery": 320, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 150, "tool_selection_stateful": 111, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 152, "sequential_reasoning_stateful": 176, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 430, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 152, "tool_selection": 117, "basic_2step": 100, "sequential_3step": 164, "conditional_routing": 196, "sequential_reasoning": 185, "error_recovery": 0, "data_gap_recovery": 213, "data_gap_recovery_extended": 0, "argument_transformation": 24, "grounded_synthesis": 214, "inconsistent_api_recovery": 302, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 160, "tool_selection_stateful": 114, "basic_2step_stateful": 101, "sequential_3step_stateful": 165, "conditional_routing_stateful": 171, "sequential_reasoning_stateful": 191, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 201, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 8.0, "tool_selection": 3.0, "basic_2step": 0.0, "sequential_3step": 14.0, "conditional_routing": 25.0, "sequential_reasoning": 14.0, "error_recovery": 0.0, "data_gap_recovery": 33.0, "data_gap_recovery_extended": 8.0, "argument_transformation": 2.0, "grounded_synthesis": 25.0, "inconsistent_api_recovery": 21.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 10.0, "tool_selection_stateful": 3.0, "basic_2step_stateful": 1.0, 
"sequential_3step_stateful": 15.0, "conditional_routing_stateful": 27.0, "sequential_reasoning_stateful": 15.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 25.0, "data_gap_recovery_extended_stateful": 17.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 9.0, "inconsistent_api_recovery_stateful": 14.0}, "scenarioWastedN": {"relevance_detection": 48, "argument_fidelity": 48, "tool_selection": 38, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 111.06, "argument_fidelity": 240.74, "tool_selection": 168.11, "basic_2step": 94.38, "sequential_3step": 282.86, "conditional_routing": 704.91, "sequential_reasoning": 447.44, "error_recovery": 0.0, "data_gap_recovery": 934.57, "data_gap_recovery_extended": 1142.23, "argument_transformation": 1193.19, "grounded_synthesis": 1152.24, "inconsistent_api_recovery": 1760.71, "relevance_detection_stateful": 104.08, "argument_fidelity_stateful": 233.49, "tool_selection_stateful": 154.95, "basic_2step_stateful": 104.59, "sequential_3step_stateful": 253.72, "conditional_routing_stateful": 679.12, "sequential_reasoning_stateful": 384.8, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1023.64, "data_gap_recovery_extended_stateful": 1068.38, 
"argument_transformation_stateful": 1100.04, "grounded_synthesis_stateful": 1077.53, "inconsistent_api_recovery_stateful": 1880.64}, "scenarioSpeedN": {"relevance_detection": 48, "argument_fidelity": 48, "tool_selection": 38, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 37, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 49}}, {"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 66.5, "accuracy": 79.9, "completeness": 83.2, "efficiency": 100.0, "wasted": 0.0, "speed": 2.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 
96, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 32}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 45, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 45, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, 
"conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 128}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 158, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 158, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 144, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 49}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 4.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 7.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, 
"scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 45, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.63, "argument_fidelity": 60.28, "tool_selection": 53.13, "basic_2step": 24.62, "sequential_3step": 70.06, "conditional_routing": 189.24, "sequential_reasoning": 111.04, "error_recovery": 0.0, "data_gap_recovery": 153.59, "data_gap_recovery_extended": 174.99, "argument_transformation": 188.47, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 280.78, "relevance_detection_stateful": 16.66, "argument_fidelity_stateful": 60.28, "tool_selection_stateful": 53.17, "basic_2step_stateful": 24.59, "sequential_3step_stateful": 70.66, "conditional_routing_stateful": 181.28, "sequential_reasoning_stateful": 106.51, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 151.97, "data_gap_recovery_extended_stateful": 178.36, "argument_transformation_stateful": 163.32, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 281.65}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, 
"argument_transformation": 45, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [bare:full]", "model": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "mistral-small-3.2", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 65.7, "accuracy": 81.1, "completeness": 81.0, "efficiency": 79.7, "wasted": 0.9, "speed": 3.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 96, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 16, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 23, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 80, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 184, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, 
"tool_selection": 144, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 80, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 900, "inconsistent_api_recovery": 151, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 101, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 887, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 32.0, "grounded_synthesis": 400.0, "inconsistent_api_recovery": 13.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 18.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 23.0, "grounded_synthesis_stateful": 392.0, "inconsistent_api_recovery_stateful": 7.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}, "scenarioSpeedSum": {"relevance_detection": 31.16, "argument_fidelity": 100.2, "tool_selection": 69.3, "basic_2step": 42.55, "sequential_3step": 102.09, "conditional_routing": 196.04, "sequential_reasoning": 112.68, "error_recovery": 0.0, "data_gap_recovery": 186.43, "data_gap_recovery_extended": 201.34, "argument_transformation": 31.59, "grounded_synthesis": 528.4, "inconsistent_api_recovery": 169.87, "relevance_detection_stateful": 31.21, "argument_fidelity_stateful": 98.75, "tool_selection_stateful": 70.87, "basic_2step_stateful": 62.17, "sequential_3step_stateful": 101.54, "conditional_routing_stateful": 198.61, "sequential_reasoning_stateful": 114.02, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 194.57, "data_gap_recovery_extended_stateful": 202.49, "argument_transformation_stateful": 22.81, "grounded_synthesis_stateful": 528.87, "inconsistent_api_recovery_stateful": 191.34}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 27}}, {"label": "Qwen3-8B-Q4_K_M LS/N [reforged:keep-last]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 64.5, "accuracy": 64.6, "completeness": 99.9, "efficiency": 90.6, "wasted": 0.4, "speed": 15.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 30, "data_gap_recovery": 82, "data_gap_recovery_extended": 0, "argument_transformation": 28, "grounded_synthesis": 12, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 98, "sequential_3step_stateful": 100, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 22, "data_gap_recovery_stateful": 86, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 
50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 15, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 6, "inconsistent_api_recovery": 5, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 30, "data_gap_recovery": 205, "data_gap_recovery_extended": 0, "argument_transformation": 70, "grounded_synthesis": 60, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, "sequential_3step_stateful": 150, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 33, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 54, "argument_fidelity": 166, "tool_selection": 203, "basic_2step": 110, "sequential_3step": 161, "conditional_routing": 251, "sequential_reasoning": 216, "error_recovery": 50, "data_gap_recovery": 180, "data_gap_recovery_extended": 0, "argument_transformation": 62, "grounded_synthesis": 32, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 60, "argument_fidelity_stateful": 168, "tool_selection_stateful": 200, 
"basic_2step_stateful": 113, "sequential_3step_stateful": 165, "conditional_routing_stateful": 245, "sequential_reasoning_stateful": 216, "error_recovery_stateful": 39, "data_gap_recovery_stateful": 195, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 13}, "scenarioWastedSum": {"relevance_detection": 4.0, "argument_fidelity": 16.0, "tool_selection": 53.0, "basic_2step": 10.0, "sequential_3step": 11.0, "conditional_routing": 51.0, "sequential_reasoning": 16.0, "error_recovery": 94.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 6.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 16.0, "relevance_detection_stateful": 10.0, "argument_fidelity_stateful": 18.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 16.0, "sequential_3step_stateful": 15.0, "conditional_routing_stateful": 53.0, "sequential_reasoning_stateful": 16.0, "error_recovery_stateful": 44.0, "data_gap_recovery_stateful": 19.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 3.0, "inconsistent_api_recovery_stateful": 4.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 58.91, "argument_fidelity": 358.79, "tool_selection": 370.09, "basic_2step": 152.53, "sequential_3step": 502.88, "conditional_routing": 944.92, "sequential_reasoning": 658.06, "error_recovery": 605.54, "data_gap_recovery": 732.29, "data_gap_recovery_extended": 949.33, "argument_transformation": 1781.98, "grounded_synthesis": 1312.45, "inconsistent_api_recovery": 1363.86, "relevance_detection_stateful": 70.42, "argument_fidelity_stateful": 387.88, "tool_selection_stateful": 380.83, "basic_2step_stateful": 179.04, "sequential_3step_stateful": 500.53, "conditional_routing_stateful": 955.99, "sequential_reasoning_stateful": 653.9, "error_recovery_stateful": 588.13, "data_gap_recovery_stateful": 740.58, "data_gap_recovery_extended_stateful": 997.34, "argument_transformation_stateful": 1599.26, "grounded_synthesis_stateful": 1310.36, "inconsistent_api_recovery_stateful": 1271.03}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q8_0 LS/N [reforged]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": 
"none", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 65.4, "accuracy": 65.4, "completeness": 100.0, "efficiency": 88.2, "wasted": 1.3, "speed": 2.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 199, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 
0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 49.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 397.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 250.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 32.1, "argument_fidelity": 63.71, "tool_selection": 62.9, "basic_2step": 34.78, "sequential_3step": 53.11, "conditional_routing": 186.2, "sequential_reasoning": 81.27, "error_recovery": 54.07, "data_gap_recovery": 201.26, "data_gap_recovery_extended": 
251.45, "argument_transformation": 251.94, "grounded_synthesis": 296.5, "inconsistent_api_recovery": 246.31, "relevance_detection_stateful": 33.26, "argument_fidelity_stateful": 64.76, "tool_selection_stateful": 51.97, "basic_2step_stateful": 38.57, "sequential_3step_stateful": 53.04, "conditional_routing_stateful": 203.91, "sequential_reasoning_stateful": 81.33, "error_recovery_stateful": 54.55, "data_gap_recovery_stateful": 205.88, "data_gap_recovery_extended_stateful": 252.94, "argument_transformation_stateful": 220.99, "grounded_synthesis_stateful": 292.28, "inconsistent_api_recovery_stateful": 251.31}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:8b-q4_K_M OL/N [reforged:full]", "model": "qwen3:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 64.9, "accuracy": 65.1, "completeness": 99.8, "efficiency": 84.7, "wasted": 0.6, "speed": 21.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 98, "error_recovery": 30, 
"data_gap_recovery": 62, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 2, "inconsistent_api_recovery": 74, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 26, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 18}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 15, "data_gap_recovery": 31, "data_gap_recovery_extended": 1, "argument_transformation": 3, "grounded_synthesis": 1, "inconsistent_api_recovery": 37, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 13, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 9}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, 
"tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 196, "error_recovery": 30, "data_gap_recovery": 155, "data_gap_recovery_extended": 8, "argument_transformation": 15, "grounded_synthesis": 10, "inconsistent_api_recovery": 296, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 39, "data_gap_recovery_stateful": 175, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 72}, "scenarioActualCalls": {"relevance_detection": 70, "argument_fidelity": 150, "tool_selection": 200, "basic_2step": 187, "sequential_3step": 151, "conditional_routing": 249, "sequential_reasoning": 196, "error_recovery": 47, "data_gap_recovery": 164, "data_gap_recovery_extended": 8, "argument_transformation": 12, "grounded_synthesis": 6, "inconsistent_api_recovery": 354, "relevance_detection_stateful": 64, "argument_fidelity_stateful": 150, "tool_selection_stateful": 203, "basic_2step_stateful": 137, "sequential_3step_stateful": 150, "conditional_routing_stateful": 251, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 192, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 111}, "scenarioWastedSum": {"relevance_detection": 20.0, "argument_fidelity": 0.0, "tool_selection": 50.0, "basic_2step": 87.0, "sequential_3step": 1.0, "conditional_routing": 57.0, "sequential_reasoning": 0.0, "error_recovery": 86.0, "data_gap_recovery": 27.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 90.0, 
"relevance_detection_stateful": 15.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 53.0, "basic_2step_stateful": 37.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 55.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 42.0, "data_gap_recovery_stateful": 28.0, "data_gap_recovery_extended_stateful": 12.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 107.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 177.68, "argument_fidelity": 313.89, "tool_selection": 515.08, "basic_2step": 346.04, "sequential_3step": 404.54, "conditional_routing": 1074.84, "sequential_reasoning": 389.12, "error_recovery": 485.21, "data_gap_recovery": 841.17, "data_gap_recovery_extended": 1241.74, "argument_transformation": 2480.48, "grounded_synthesis": 1647.12, "inconsistent_api_recovery": 3610.82, "relevance_detection_stateful": 190.1, "argument_fidelity_stateful": 297.64, "tool_selection_stateful": 533.84, "basic_2step_stateful": 248.29, "sequential_3step_stateful": 472.71, "conditional_routing_stateful": 1047.48, "sequential_reasoning_stateful": 413.98, 
"error_recovery_stateful": 459.71, "data_gap_recovery_stateful": 905.37, "data_gap_recovery_extended_stateful": 1283.58, "argument_transformation_stateful": 2440.56, "grounded_synthesis_stateful": 1741.41, "inconsistent_api_recovery_stateful": 3714.22}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407.Q4_K_M LF/P [reforged:full]", "model": "Mistral-Nemo-Instruct-2407.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 64.8, "accuracy": 66.6, "completeness": 97.3, "efficiency": 100.0, "wasted": 0.4, "speed": 4.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 96, "error_recovery": 92, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 82, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 50, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, 
"conditional_routing_stateful": 100, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 88, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 68, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 48, "error_recovery": 46, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 41, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 25, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 44, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 192, "error_recovery": 92, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 410, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 75, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 132, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 340, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 121, "basic_2step": 101, "sequential_3step": 150, "conditional_routing": 247, "sequential_reasoning": 189, "error_recovery": 146, "data_gap_recovery": 8, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 256, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 75, "basic_2step_stateful": 103, "sequential_3step_stateful": 150, "conditional_routing_stateful": 248, "sequential_reasoning_stateful": 197, "error_recovery_stateful": 136, "data_gap_recovery_stateful": 13, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 216, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 1.0, "sequential_3step": 0.0, "conditional_routing": 62.0, "sequential_reasoning": 1.0, "error_recovery": 60.0, "data_gap_recovery": 47.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 141.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 3.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 61.0, "sequential_reasoning_stateful": 3.0, "error_recovery_stateful": 7.0, "data_gap_recovery_stateful": 52.0, 
"data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 121.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}, "scenarioSpeedSum": {"relevance_detection": 37.18, "argument_fidelity": 99.91, "tool_selection": 83.36, "basic_2step": 66.41, "sequential_3step": 95.75, "conditional_routing": 279.02, "sequential_reasoning": 151.46, "error_recovery": 102.95, "data_gap_recovery": 311.3, "data_gap_recovery_extended": 415.25, "argument_transformation": 335.33, "grounded_synthesis": 657.15, "inconsistent_api_recovery": 396.82, "relevance_detection_stateful": 37.8, "argument_fidelity_stateful": 99.85, "tool_selection_stateful": 81.45, "basic_2step_stateful": 75.02, "sequential_3step_stateful": 95.86, "conditional_routing_stateful": 307.41, "sequential_reasoning_stateful": 153.92, "error_recovery_stateful": 94.02, "data_gap_recovery_stateful": 337.48, "data_gap_recovery_extended_stateful": 414.34, "argument_transformation_stateful": 327.11, "grounded_synthesis_stateful": 676.49, "inconsistent_api_recovery_stateful": 289.79}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 50, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 31}}, {"label": "granite-4.1-8b-Q4_K_M LS/N [reforged:keep-last]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 65.4, "accuracy": 68.0, "completeness": 96.2, "efficiency": 89.7, "wasted": 0.8, "speed": 1.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 
50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 21.61, "argument_fidelity": 41.64, "tool_selection": 34.16, "basic_2step": 22.62, "sequential_3step": 34.65, "conditional_routing": 268.31, "sequential_reasoning": 53.24, "error_recovery": 35.59, "data_gap_recovery": 131.28, "data_gap_recovery_extended": 169.99, "argument_transformation": 0.0, "grounded_synthesis": 162.03, "inconsistent_api_recovery": 159.88, "relevance_detection_stateful": 22.64, "argument_fidelity_stateful": 42.76, "tool_selection_stateful": 34.4, "basic_2step_stateful": 25.07, "sequential_3step_stateful": 34.57, "conditional_routing_stateful": 97.67, "sequential_reasoning_stateful": 59.53, "error_recovery_stateful": 36.06, "data_gap_recovery_stateful": 135.34, "data_gap_recovery_extended_stateful": 176.35, "argument_transformation_stateful": 171.99, "grounded_synthesis_stateful": 149.04, "inconsistent_api_recovery_stateful": 167.14}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q4_K_M LS/N [reforged:full]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 65.4, "accuracy": 68.0, "completeness": 96.2, "efficiency": 89.7, "wasted": 0.8, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.12, "argument_fidelity": 42.64, "tool_selection": 35.17, "basic_2step": 23.11, "sequential_3step": 35.6, "conditional_routing": 275.95, "sequential_reasoning": 54.66, "error_recovery": 36.59, "data_gap_recovery": 135.14, "data_gap_recovery_extended": 174.43, "argument_transformation": 0.0, "grounded_synthesis": 166.42, "inconsistent_api_recovery": 164.09, "relevance_detection_stateful": 23.27, "argument_fidelity_stateful": 43.76, "tool_selection_stateful": 35.36, "basic_2step_stateful": 26.06, "sequential_3step_stateful": 35.54, "conditional_routing_stateful": 100.41, "sequential_reasoning_stateful": 61.02, "error_recovery_stateful": 37.05, "data_gap_recovery_stateful": 139.05, "data_gap_recovery_extended_stateful": 181.11, "argument_transformation_stateful": 176.55, "grounded_synthesis_stateful": 153.22, "inconsistent_api_recovery_stateful": 171.64}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, 
{"label": "granite-4.1-8b-Q4_K_M LS/N [reforged]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "none", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 65.4, "accuracy": 68.0, "completeness": 96.2, "efficiency": 89.7, "wasted": 0.8, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.15, "argument_fidelity": 42.83, "tool_selection": 35.17, "basic_2step": 
23.15, "sequential_3step": 35.62, "conditional_routing": 276.27, "sequential_reasoning": 54.89, "error_recovery": 36.59, "data_gap_recovery": 135.17, "data_gap_recovery_extended": 174.89, "argument_transformation": 0.0, "grounded_synthesis": 167.2, "inconsistent_api_recovery": 164.45, "relevance_detection_stateful": 23.27, "argument_fidelity_stateful": 44.01, "tool_selection_stateful": 35.79, "basic_2step_stateful": 26.25, "sequential_3step_stateful": 35.86, "conditional_routing_stateful": 101.32, "sequential_reasoning_stateful": 61.64, "error_recovery_stateful": 37.25, "data_gap_recovery_stateful": 139.49, "data_gap_recovery_extended_stateful": 181.55, "argument_transformation_stateful": 177.06, "grounded_synthesis_stateful": 153.61, "inconsistent_api_recovery_stateful": 172.16}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/N [reforged:full]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 65.3, "accuracy": 70.8, "completeness": 92.3, "efficiency": 89.9, "wasted": 0.4, "speed": 2.6, "n": 50, 
"scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 100, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 100, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 250, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 200, "data_gap_recovery": 200, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 454, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 250, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 200, "data_gap_recovery_stateful": 196, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 100.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 
0.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 100.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 25.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 39.51, "argument_fidelity": 75.28, "tool_selection": 109.15, "basic_2step": 44.05, "sequential_3step": 68.53, "conditional_routing": 219.79, "sequential_reasoning": 90.97, "error_recovery": 73.56, "data_gap_recovery": 157.72, "data_gap_recovery_extended": 218.88, "argument_transformation": 134.18, "grounded_synthesis": 271.78, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 39.37, "argument_fidelity_stateful": 74.85, 
"tool_selection_stateful": 108.9, "basic_2step_stateful": 51.7, "sequential_3step_stateful": 68.39, "conditional_routing_stateful": 151.95, "sequential_reasoning_stateful": 228.81, "error_recovery_stateful": 73.11, "data_gap_recovery_stateful": 163.44, "data_gap_recovery_extended_stateful": 220.78, "argument_transformation_stateful": 134.69, "grounded_synthesis_stateful": 303.06, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [bare:full]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 65.5, "accuracy": 75.6, "completeness": 86.6, "efficiency": 100.0, "wasted": 0.1, "speed": 23.0, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 100, "tool_selection": 32, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 94, "data_gap_recovery_extended": 16, "argument_transformation": 52, "grounded_synthesis": 46, "inconsistent_api_recovery": 98, 
"relevance_detection_stateful": 88, "argument_fidelity_stateful": 100, "tool_selection_stateful": 28, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 60, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 8, "argument_transformation": 26, "grounded_synthesis": 23, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 30, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 5, 
"argument_transformation_stateful": 25, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 48, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 192, "error_recovery": 0, 
"data_gap_recovery": 235, "data_gap_recovery_extended": 64, "argument_transformation": 130, "grounded_synthesis": 230, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, "tool_selection_stateful": 42, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 120, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 125, "grounded_synthesis_stateful": 250, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 150, "tool_selection": 48, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 151, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 190, "data_gap_recovery_extended": 27, "argument_transformation": 96, "grounded_synthesis": 151, "inconsistent_api_recovery": 323, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 150, "tool_selection_stateful": 42, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 197, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 90, "grounded_synthesis_stateful": 134, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 5.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 6.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 28.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 
0.0, "conditional_routing_stateful": 4.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 2.0, "grounded_synthesis_stateful": 4.0, "inconsistent_api_recovery_stateful": 24.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 347.79, "argument_fidelity": 296.49, "tool_selection": 68.71, "basic_2step": 426.66, "sequential_3step": 472.57, "conditional_routing": 609.54, "sequential_reasoning": 541.01, "error_recovery": 0.0, "data_gap_recovery": 914.84, "data_gap_recovery_extended": 1059.1, "argument_transformation": 2353.43, "grounded_synthesis": 2576.22, "inconsistent_api_recovery": 3609.53, "relevance_detection_stateful": 350.41, "argument_fidelity_stateful": 278.55, "tool_selection_stateful": 53.36, "basic_2step_stateful": 504.91, "sequential_3step_stateful": 524.72, "conditional_routing_stateful": 543.93, "sequential_reasoning_stateful": 759.5, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 998.17, "data_gap_recovery_extended_stateful": 1020.58, "argument_transformation_stateful": 2034.93, 
"grounded_synthesis_stateful": 2568.03, "inconsistent_api_recovery_stateful": 2948.49}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 14, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3-14B-Q4_K_M LS/N [reforged:keep-last]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "keep-last", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 64.0, "accuracy": 64.0, "completeness": 99.9, "efficiency": 91.1, "wasted": 0.6, "speed": 20.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 90, "sequential_reasoning": 98, "error_recovery": 48, "data_gap_recovery": 40, "data_gap_recovery_extended": 18, "argument_transformation": 6, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 98, "sequential_3step_stateful": 96, "conditional_routing_stateful": 88, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 54, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 16, 
"argument_transformation_stateful": 2, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 45, "sequential_reasoning": 49, "error_recovery": 24, "data_gap_recovery": 20, "data_gap_recovery_extended": 9, "argument_transformation": 3, "grounded_synthesis": 19, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 48, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 27, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 180, "sequential_reasoning": 196, "error_recovery": 48, "data_gap_recovery": 100, "data_gap_recovery_extended": 72, "argument_transformation": 15, "grounded_synthesis": 190, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, "sequential_3step_stateful": 144, "conditional_routing_stateful": 176, 
"sequential_reasoning_stateful": 200, "error_recovery_stateful": 81, "data_gap_recovery_stateful": 75, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 190, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 210, "tool_selection": 219, "basic_2step": 170, "sequential_3step": 184, "conditional_routing": 179, "sequential_reasoning": 231, "error_recovery": 94, "data_gap_recovery": 82, "data_gap_recovery_extended": 31, "argument_transformation": 15, "grounded_synthesis": 111, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 211, "tool_selection_stateful": 220, "basic_2step_stateful": 149, "sequential_3step_stateful": 163, "conditional_routing_stateful": 167, "sequential_reasoning_stateful": 232, "error_recovery_stateful": 107, "data_gap_recovery_stateful": 57, "data_gap_recovery_extended_stateful": 29, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 87, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 60.0, "tool_selection": 69.0, "basic_2step": 70.0, "sequential_3step": 34.0, "conditional_routing": 39.0, "sequential_reasoning": 36.0, "error_recovery": 102.0, "data_gap_recovery": 10.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 24.0, "grounded_synthesis": 5.0, "inconsistent_api_recovery": 4.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 61.0, "tool_selection_stateful": 70.0, "basic_2step_stateful": 51.0, "sequential_3step_stateful": 19.0, "conditional_routing_stateful": 40.0, "sequential_reasoning_stateful": 32.0, "error_recovery_stateful": 58.0, "data_gap_recovery_stateful": 8.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 32.0, "grounded_synthesis_stateful": 5.0, "inconsistent_api_recovery_stateful": 7.0}, "scenarioWastedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 98.17, "argument_fidelity": 514.66, "tool_selection": 563.8, "basic_2step": 412.69, "sequential_3step": 776.57, "conditional_routing": 1027.01, "sequential_reasoning": 999.64, "error_recovery": 574.47, "data_gap_recovery": 804.33, "data_gap_recovery_extended": 1068.22, "argument_transformation": 2867.92, "grounded_synthesis": 2068.33, "inconsistent_api_recovery": 1510.92, "relevance_detection_stateful": 99.42, "argument_fidelity_stateful": 531.21, "tool_selection_stateful": 519.74, "basic_2step_stateful": 327.64, "sequential_3step_stateful": 664.5, "conditional_routing_stateful": 1048.48, "sequential_reasoning_stateful": 931.79, "error_recovery_stateful": 604.97, "data_gap_recovery_stateful": 784.05, "data_gap_recovery_extended_stateful": 1066.59, "argument_transformation_stateful": 2806.01, "grounded_synthesis_stateful": 2042.66, "inconsistent_api_recovery_stateful": 1633.65}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/P [bare]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 63.5, "accuracy": 69.6, "completeness": 91.3, "efficiency": 96.6, "wasted": 0.2, "speed": 27.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 96, "basic_2step": 98, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 18, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 
50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 9, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 196, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 90, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 128, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 144, "basic_2step": 98, "sequential_3step": 153, 
"conditional_routing": 235, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 229, "data_gap_recovery_extended": 0, "argument_transformation": 16, "grounded_synthesis": 78, "inconsistent_api_recovery": 404, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 100, "sequential_3step_stateful": 153, "conditional_routing_stateful": 159, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 219, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 53, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 3.0, "conditional_routing": 40.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 49.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 3.0, "conditional_routing_stateful": 31.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 10.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 2.0, "inconsistent_api_recovery_stateful": 48.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 85.82, "argument_fidelity": 293.67, "tool_selection": 343.59, "basic_2step": 253.05, "sequential_3step": 683.96, "conditional_routing": 1238.64, "sequential_reasoning": 470.86, "error_recovery": 0.0, "data_gap_recovery": 1001.81, "data_gap_recovery_extended": 1563.53, "argument_transformation": 2821.05, "grounded_synthesis": 3452.24, "inconsistent_api_recovery": 4038.12, "relevance_detection_stateful": 79.22, "argument_fidelity_stateful": 303.71, "tool_selection_stateful": 326.43, "basic_2step_stateful": 292.88, "sequential_3step_stateful": 713.2, "conditional_routing_stateful": 1157.18, "sequential_reasoning_stateful": 490.87, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 986.13, "data_gap_recovery_extended_stateful": 1449.76, "argument_transformation_stateful": 2815.5, "grounded_synthesis_stateful": 2989.33, "inconsistent_api_recovery_stateful": 4247.36}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 48, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/P [bare:full]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 63.9, "accuracy": 69.9, "completeness": 91.5, "efficiency": 96.1, "wasted": 0.2, "speed": 27.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 94, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 88, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 14, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 7, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 220, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 70, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 151, "conditional_routing": 223, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 226, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 62, "inconsistent_api_recovery": 406, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 
152, "conditional_routing_stateful": 170, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 251, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 95, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 1.0, "conditional_routing": 33.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 7.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 51.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 2.0, "conditional_routing_stateful": 34.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 5.0, "inconsistent_api_recovery_stateful": 71.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, 
"scenarioSpeedSum": {"relevance_detection": 85.54, "argument_fidelity": 299.55, "tool_selection": 323.72, "basic_2step": 261.56, "sequential_3step": 710.06, "conditional_routing": 1143.59, "sequential_reasoning": 481.06, "error_recovery": 0.0, "data_gap_recovery": 960.26, "data_gap_recovery_extended": 1506.67, "argument_transformation": 2563.49, "grounded_synthesis": 3307.0, "inconsistent_api_recovery": 3955.08, "relevance_detection_stateful": 84.67, "argument_fidelity_stateful": 291.31, "tool_selection_stateful": 353.28, "basic_2step_stateful": 298.05, "sequential_3step_stateful": 687.65, "conditional_routing_stateful": 1137.44, "sequential_reasoning_stateful": 465.86, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1034.69, "data_gap_recovery_extended_stateful": 1503.63, "argument_transformation_stateful": 2896.03, "grounded_synthesis_stateful": 3550.46, "inconsistent_api_recovery_stateful": 4508.53}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", 
"gen": 3, "retired": false, "score": 63.6, "accuracy": 78.5, "completeness": 81.0, "efficiency": 100.0, "wasted": 0.3, "speed": 3.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 82, "sequential_reasoning": 90, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 20, "argument_transformation": 4, "grounded_synthesis": 46, "inconsistent_api_recovery": 72, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 72, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 56, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 41, "sequential_reasoning": 45, "error_recovery": 0, 
"data_gap_recovery": 30, "data_gap_recovery_extended": 10, "argument_transformation": 2, "grounded_synthesis": 23, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 45, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 38, "argument_transformation": 32, "grounded_synthesis": 39, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 33, "argument_transformation_stateful": 31, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 45, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 38, "argument_transformation": 32, "grounded_synthesis": 39, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, 
"sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 33, "argument_transformation_stateful": 31, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 164, "sequential_reasoning": 180, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 80, "argument_transformation": 10, "grounded_synthesis": 230, "inconsistent_api_recovery": 288, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 144, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 140, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 200, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 192, "sequential_reasoning": 180, "error_recovery": 0, "data_gap_recovery": 112, "data_gap_recovery_extended": 37, "argument_transformation": 10, "grounded_synthesis": 160, "inconsistent_api_recovery": 365, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 115, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 126, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, 
"argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 34.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 16.0, "grounded_synthesis": 18.0, "inconsistent_api_recovery": 77.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 30.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 23.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 72.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 45, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 38, "argument_transformation": 32, "grounded_synthesis": 39, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 33, "argument_transformation_stateful": 31, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}, "scenarioSpeedSum": {"relevance_detection": 17.89, "argument_fidelity": 61.17, "tool_selection": 51.64, "basic_2step": 31.06, "sequential_3step": 70.92, "conditional_routing": 188.83, "sequential_reasoning": 81.47, "error_recovery": 0.0, "data_gap_recovery": 185.77, "data_gap_recovery_extended": 226.0, "argument_transformation": 276.17, "grounded_synthesis": 411.35, 
"inconsistent_api_recovery": 190.1, "relevance_detection_stateful": 17.27, "argument_fidelity_stateful": 63.59, "tool_selection_stateful": 51.43, "basic_2step_stateful": 35.06, "sequential_3step_stateful": 75.55, "conditional_routing_stateful": 173.87, "sequential_reasoning_stateful": 84.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 126.46, "data_gap_recovery_extended_stateful": 221.66, "argument_transformation_stateful": 192.36, "grounded_synthesis_stateful": 394.95, "inconsistent_api_recovery_stateful": 209.14}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 45, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 38, "argument_transformation": 32, "grounded_synthesis": 39, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 33, "argument_transformation_stateful": 31, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}}, {"label": "Qwen3-8B-Q8_0 LS/P [bare:keep-last]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 63.4, "accuracy": 69.7, "completeness": 90.9, "efficiency": 96.0, "wasted": 0.2, "speed": 28.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 94, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 98, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 88, "data_gap_recovery_extended": 0, "argument_transformation": 
4, "grounded_synthesis": 16, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 88, "basic_2step_stateful": 100, "sequential_3step_stateful": 94, "conditional_routing_stateful": 72, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 96, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 8, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 
48, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 196, 
"sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 220, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 80, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 132, "basic_2step_stateful": 100, "sequential_3step_stateful": 141, "conditional_routing_stateful": 144, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 240, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 152, "conditional_routing": 217, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 228, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 73, "inconsistent_api_recovery": 417, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 132, "basic_2step_stateful": 100, "sequential_3step_stateful": 142, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 253, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 2.0, "conditional_routing": 25.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 10.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 8.0, "inconsistent_api_recovery": 59.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, 
"basic_2step_stateful": 0.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 36.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 13.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 5.0, "inconsistent_api_recovery_stateful": 51.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 77.71, "argument_fidelity": 308.21, "tool_selection": 335.17, "basic_2step": 252.07, "sequential_3step": 678.53, "conditional_routing": 1132.22, "sequential_reasoning": 483.41, "error_recovery": 0.0, "data_gap_recovery": 985.41, "data_gap_recovery_extended": 1558.86, "argument_transformation": 2946.78, "grounded_synthesis": 3652.01, "inconsistent_api_recovery": 4313.61, "relevance_detection_stateful": 83.58, "argument_fidelity_stateful": 304.19, "tool_selection_stateful": 309.56, "basic_2step_stateful": 319.67, "sequential_3step_stateful": 655.56, "conditional_routing_stateful": 1219.0, "sequential_reasoning_stateful": 488.79, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1014.31, "data_gap_recovery_extended_stateful": 1531.44, 
"argument_transformation_stateful": 3077.52, "grounded_synthesis_stateful": 3758.58, "inconsistent_api_recovery_stateful": 4247.82}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 44, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare:keep-last]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 63.4, "accuracy": 76.7, "completeness": 82.7, "efficiency": 100.0, "wasted": 0.3, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 68, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 64, "data_gap_recovery_extended": 20, "argument_transformation": 4, "grounded_synthesis": 46, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 96, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 58, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 34, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 10, "argument_transformation": 2, "grounded_synthesis": 23, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 29, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, 
"sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 40, "data_gap_recovery_extended": 34, "argument_transformation": 43, "grounded_synthesis": 42, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 40}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 40, "data_gap_recovery_extended": 34, "argument_transformation": 43, "grounded_synthesis": 42, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 40}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 136, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 160, "data_gap_recovery_extended": 80, "argument_transformation": 10, "grounded_synthesis": 230, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 144, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 150, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 145, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 220, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 165, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 121, "data_gap_recovery_extended": 35, "argument_transformation": 10, "grounded_synthesis": 182, "inconsistent_api_recovery": 381, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 144, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 159, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 110, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 153, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 31.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 18.0, "grounded_synthesis": 34.0, "inconsistent_api_recovery": 75.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 15.0, "grounded_synthesis_stateful": 15.0, 
"inconsistent_api_recovery_stateful": 74.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 40, "data_gap_recovery_extended": 34, "argument_transformation": 43, "grounded_synthesis": 42, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 40}, "scenarioSpeedSum": {"relevance_detection": 16.93, "argument_fidelity": 59.2, "tool_selection": 49.44, "basic_2step": 29.8, "sequential_3step": 62.14, "conditional_routing": 177.12, "sequential_reasoning": 82.09, "error_recovery": 0.0, "data_gap_recovery": 167.47, "data_gap_recovery_extended": 221.0, "argument_transformation": 280.04, "grounded_synthesis": 417.98, "inconsistent_api_recovery": 200.48, "relevance_detection_stateful": 16.83, "argument_fidelity_stateful": 60.57, "tool_selection_stateful": 47.25, "basic_2step_stateful": 33.14, "sequential_3step_stateful": 63.21, "conditional_routing_stateful": 157.5, "sequential_reasoning_stateful": 80.03, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 157.96, "data_gap_recovery_extended_stateful": 195.6, "argument_transformation_stateful": 249.89, "grounded_synthesis_stateful": 373.39, "inconsistent_api_recovery_stateful": 192.91}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 49, "error_recovery": 0, 
"data_gap_recovery": 40, "data_gap_recovery_extended": 34, "argument_transformation": 43, "grounded_synthesis": 42, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 48, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 40}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare:full]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 63.5, "accuracy": 77.3, "completeness": 82.1, "efficiency": 100.0, "wasted": 0.3, "speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 76, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 64, "data_gap_recovery_extended": 12, "argument_transformation": 2, "grounded_synthesis": 40, "inconsistent_api_recovery": 70, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 52, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 38, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 6, "argument_transformation": 1, "grounded_synthesis": 20, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 35, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 36, "argument_transformation": 35, "grounded_synthesis": 38, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 
50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 43}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 36, "argument_transformation": 35, "grounded_synthesis": 38, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 43}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 152, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 160, "data_gap_recovery_extended": 48, "argument_transformation": 5, "grounded_synthesis": 200, "inconsistent_api_recovery": 280, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 260, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": 
{"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 175, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 131, "data_gap_recovery_extended": 18, "argument_transformation": 5, "grounded_synthesis": 148, "inconsistent_api_recovery": 348, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 157, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 114, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 167, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 30.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 21.0, "grounded_synthesis": 23.0, "inconsistent_api_recovery": 68.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 26.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 13.0, "inconsistent_api_recovery_stateful": 84.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 36, "argument_transformation": 35, 
"grounded_synthesis": 38, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 43}, "scenarioSpeedSum": {"relevance_detection": 17.38, "argument_fidelity": 61.22, "tool_selection": 53.62, "basic_2step": 30.69, "sequential_3step": 64.17, "conditional_routing": 181.15, "sequential_reasoning": 86.03, "error_recovery": 0.0, "data_gap_recovery": 148.56, "data_gap_recovery_extended": 221.48, "argument_transformation": 262.38, "grounded_synthesis": 381.0, "inconsistent_api_recovery": 181.33, "relevance_detection_stateful": 17.3, "argument_fidelity_stateful": 60.97, "tool_selection_stateful": 52.07, "basic_2step_stateful": 34.83, "sequential_3step_stateful": 64.77, "conditional_routing_stateful": 187.06, "sequential_reasoning_stateful": 81.68, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 171.54, "data_gap_recovery_extended_stateful": 182.57, "argument_transformation_stateful": 260.87, "grounded_synthesis_stateful": 367.24, "inconsistent_api_recovery_stateful": 221.5}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 36, "argument_transformation": 35, "grounded_synthesis": 38, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, 
"sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 34, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 43}}, {"label": "granite-4.1-8b-Q4_K_M LS/P [reforged]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 61.5, "accuracy": 61.5, "completeness": 100.0, "efficiency": 90.4, "wasted": 0.3, "speed": 2.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 100, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 250, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 250, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 350, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 150, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 150, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 350, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 200.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 50.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 200.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 54.56, "argument_fidelity": 39.66, "tool_selection": 155.65, "basic_2step": 20.04, "sequential_3step": 40.03, "conditional_routing": 106.19, "sequential_reasoning": 53.06, "error_recovery": 22.68, "data_gap_recovery": 75.1, "data_gap_recovery_extended": 177.05, "argument_transformation": 470.58, "grounded_synthesis": 146.29, "inconsistent_api_recovery": 163.01, "relevance_detection_stateful": 57.16, "argument_fidelity_stateful": 40.51, "tool_selection_stateful": 156.36, "basic_2step_stateful": 22.02, "sequential_3step_stateful": 40.04, "conditional_routing_stateful": 101.13, "sequential_reasoning_stateful": 52.53, "error_recovery_stateful": 23.03, "data_gap_recovery_stateful": 76.73, "data_gap_recovery_extended_stateful": 177.03, "argument_transformation_stateful": 470.2, "grounded_synthesis_stateful": 133.59, "inconsistent_api_recovery_stateful": 366.07}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [bare]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": 
"llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 62.5, "accuracy": 67.8, "completeness": 92.2, "efficiency": 91.5, "wasted": 0.4, "speed": 9.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 56, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 88, "data_gap_recovery_extended": 0, "argument_transformation": 22, "grounded_synthesis": 30, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 10, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 28, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 0, "argument_transformation": 11, "grounded_synthesis": 15, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 5, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 112, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 220, "data_gap_recovery_extended": 0, "argument_transformation": 55, "grounded_synthesis": 150, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 119, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 240, "data_gap_recovery_extended": 0, "argument_transformation": 44, "grounded_synthesis": 221, "inconsistent_api_recovery": 518, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 238, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 196, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 7.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 25.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 134.0, "inconsistent_api_recovery": 118.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 4.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 23.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 113.0, "inconsistent_api_recovery_stateful": 111.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 30.66, "argument_fidelity": 89.98, "tool_selection": 117.02, "basic_2step": 52.3, "sequential_3step": 112.42, "conditional_routing": 470.1, "sequential_reasoning": 181.03, "error_recovery": 
0.0, "data_gap_recovery": 590.22, "data_gap_recovery_extended": 463.52, "argument_transformation": 1219.44, "grounded_synthesis": 916.77, "inconsistent_api_recovery": 1159.13, "relevance_detection_stateful": 38.37, "argument_fidelity_stateful": 96.67, "tool_selection_stateful": 106.85, "basic_2step_stateful": 53.6, "sequential_3step_stateful": 111.36, "conditional_routing_stateful": 446.28, "sequential_reasoning_stateful": 199.52, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 552.32, "data_gap_recovery_extended_stateful": 487.94, "argument_transformation_stateful": 1176.65, "grounded_synthesis_stateful": 946.2, "inconsistent_api_recovery_stateful": 1168.31}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "granite-4.1-8b-Q8_0 LS/P [reforged]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "none", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 61.5, "accuracy": 66.7, "completeness": 92.3, "efficiency": 73.4, "wasted": 1.0, "speed": 5.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, 
"conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": 
{"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 350, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 300, "sequential_reasoning": 200, "error_recovery": 200, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 695, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 350, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 300, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 200, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 695, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 200.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 100.0, "sequential_reasoning": 0.0, "error_recovery": 100.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 196.0, 
"inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 200.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 100.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 196.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 62.04, "tool_selection": 227.67, "basic_2step": 31.08, "sequential_3step": 61.55, "conditional_routing": 303.66, "sequential_reasoning": 78.6, "error_recovery": 68.5, "data_gap_recovery": 163.59, "data_gap_recovery_extended": 186.68, "argument_transformation": 1139.6, "grounded_synthesis": 275.12, "inconsistent_api_recovery": 531.2, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 62.66, "tool_selection_stateful": 228.14, "basic_2step_stateful": 35.06, "sequential_3step_stateful": 63.04, "conditional_routing_stateful": 292.34, "sequential_reasoning_stateful": 75.54, 
"error_recovery_stateful": 69.04, "data_gap_recovery_stateful": 167.63, "data_gap_recovery_extended_stateful": 186.61, "argument_transformation_stateful": 1139.13, "grounded_synthesis_stateful": 274.76, "inconsistent_api_recovery_stateful": 518.75}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma4:e4b-it-q8_0 OL/N [bare:full]", "model": "gemma4:e4b-it-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "gemma4-e4b", "quant": "q8_0", "gen": 2, "retired": false, "score": 62.5, "accuracy": 69.9, "completeness": 89.3, "efficiency": 88.9, "wasted": 0.5, "speed": 12.0, "n": 50, "scenarios": {"relevance_detection": 90, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 92, "error_recovery": 0, "data_gap_recovery": 72, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 66, 
"sequential_reasoning_stateful": 94, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 24, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 33, "sequential_reasoning_stateful": 47, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 21, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 45, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 180, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 240, "inconsistent_api_recovery": 200, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 132, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 45, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 206, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 191, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 365, "inconsistent_api_recovery": 244, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 188, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 224, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 313, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 34.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 187.0, "inconsistent_api_recovery": 71.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 32.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 18.0, "data_gap_recovery_extended_stateful": 0.0, 
"argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 159.0, "inconsistent_api_recovery_stateful": 68.0}, "scenarioWastedN": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 104.32, "argument_fidelity": 144.59, "tool_selection": 135.98, "basic_2step": 86.76, "sequential_3step": 146.31, "conditional_routing": 524.41, "sequential_reasoning": 231.32, "error_recovery": 0.0, "data_gap_recovery": 547.74, "data_gap_recovery_extended": 823.16, "argument_transformation": 1444.28, "grounded_synthesis": 1187.06, "inconsistent_api_recovery": 1562.61, "relevance_detection_stateful": 122.64, "argument_fidelity_stateful": 138.12, "tool_selection_stateful": 134.42, "basic_2step_stateful": 78.11, "sequential_3step_stateful": 154.8, "conditional_routing_stateful": 545.16, "sequential_reasoning_stateful": 245.18, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 654.16, "data_gap_recovery_extended_stateful": 756.53, "argument_transformation_stateful": 1450.34, "grounded_synthesis_stateful": 1143.72, "inconsistent_api_recovery_stateful": 1561.77}, "scenarioSpeedN": {"relevance_detection": 45, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 47, "argument_transformation": 48, "grounded_synthesis": 49, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}}, {"label": "gemma4:e4b-it-q4_K_M OL/N [bare:full]", "model": "gemma4:e4b-it-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 62.3, "accuracy": 71.7, "completeness": 86.9, "efficiency": 89.8, "wasted": 0.4, "speed": 8.7, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 88, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 82, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 50, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 96, "argument_fidelity_stateful": 100, "tool_selection_stateful": 92, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 72, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 25, "inconsistent_api_recovery": 12, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 205, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 250, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 150, "tool_selection_stateful": 138, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 180, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 190, 
"inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 236, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 222, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 357, "inconsistent_api_recovery": 113, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 150, "tool_selection_stateful": 138, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 210, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 192, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 275, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 22.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 160.0, "inconsistent_api_recovery": 22.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 42.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 20.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 8.0, "grounded_synthesis_stateful": 144.0, "inconsistent_api_recovery_stateful": 36.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}, "scenarioSpeedSum": {"relevance_detection": 88.15, "argument_fidelity": 107.22, "tool_selection": 100.8, "basic_2step": 62.48, "sequential_3step": 110.07, "conditional_routing": 416.86, "sequential_reasoning": 221.52, "error_recovery": 0.0, "data_gap_recovery": 534.08, "data_gap_recovery_extended": 705.24, "argument_transformation": 912.43, "grounded_synthesis": 816.48, "inconsistent_api_recovery": 652.24, "relevance_detection_stateful": 99.63, "argument_fidelity_stateful": 107.96, "tool_selection_stateful": 98.95, "basic_2step_stateful": 63.36, "sequential_3step_stateful": 109.32, "conditional_routing_stateful": 451.5, "sequential_reasoning_stateful": 205.7, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 559.44, "data_gap_recovery_extended_stateful": 704.37, "argument_transformation_stateful": 993.37, "grounded_synthesis_stateful": 899.33, "inconsistent_api_recovery_stateful": 776.24}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 25, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 50, "tool_selection_stateful": 46, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 33}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [bare:keep-last]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 61.2, "accuracy": 66.4, "completeness": 92.2, "efficiency": 94.6, "wasted": 0.3, "speed": 12.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 68, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 84, "data_gap_recovery_extended": 0, "argument_transformation": 22, "grounded_synthesis": 20, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 82, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 
50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 0, "argument_transformation": 11, "grounded_synthesis": 10, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 136, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 210, "data_gap_recovery_extended": 0, "argument_transformation": 55, "grounded_synthesis": 100, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 120, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 134, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 236, "data_gap_recovery_extended": 0, "argument_transformation": 44, "grounded_synthesis": 106, 
"inconsistent_api_recovery": 453, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 232, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 139, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 1.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 35.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 38.0, "inconsistent_api_recovery": 101.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 33.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 44.0, "inconsistent_api_recovery_stateful": 101.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 48.77, "argument_fidelity": 161.79, "tool_selection": 149.08, "basic_2step": 81.41, "sequential_3step": 151.25, "conditional_routing": 570.39, "sequential_reasoning": 282.71, "error_recovery": 0.0, "data_gap_recovery": 779.11, "data_gap_recovery_extended": 800.38, "argument_transformation": 1516.35, "grounded_synthesis": 1171.56, "inconsistent_api_recovery": 1756.86, "relevance_detection_stateful": 49.93, "argument_fidelity_stateful": 171.15, "tool_selection_stateful": 154.19, "basic_2step_stateful": 75.53, "sequential_3step_stateful": 147.94, "conditional_routing_stateful": 546.43, "sequential_reasoning_stateful": 285.48, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 787.48, "data_gap_recovery_extended_stateful": 857.75, "argument_transformation_stateful": 1437.37, "grounded_synthesis_stateful": 1227.72, "inconsistent_api_recovery_stateful": 1731.0}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [bare]", "model": 
"gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 61.2, "accuracy": 66.3, "completeness": 92.3, "efficiency": 94.1, "wasted": 0.3, "speed": 12.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 64, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 84, "data_gap_recovery_extended": 0, "argument_transformation": 22, "grounded_synthesis": 22, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 
50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 0, "argument_transformation": 11, "grounded_synthesis": 11, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 128, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 210, "data_gap_recovery_extended": 0, "argument_transformation": 55, "grounded_synthesis": 110, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 126, "sequential_reasoning": 192, "error_recovery": 0, "data_gap_recovery": 238, "data_gap_recovery_extended": 0, "argument_transformation": 44, "grounded_synthesis": 123, "inconsistent_api_recovery": 479, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 233, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 163, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 1.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 34.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 40.0, "inconsistent_api_recovery": 105.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 39.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 43.0, "inconsistent_api_recovery_stateful": 105.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 44.75, "argument_fidelity": 154.94, "tool_selection": 154.49, "basic_2step": 77.62, "sequential_3step": 148.78, "conditional_routing": 540.11, 
"sequential_reasoning": 274.84, "error_recovery": 0.0, "data_gap_recovery": 782.73, "data_gap_recovery_extended": 854.18, "argument_transformation": 1369.63, "grounded_synthesis": 1211.71, "inconsistent_api_recovery": 1719.36, "relevance_detection_stateful": 47.46, "argument_fidelity_stateful": 158.35, "tool_selection_stateful": 155.36, "basic_2step_stateful": 74.83, "sequential_3step_stateful": 147.33, "conditional_routing_stateful": 550.88, "sequential_reasoning_stateful": 273.16, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 797.74, "data_gap_recovery_extended_stateful": 817.47, "argument_transformation_stateful": 1340.01, "grounded_synthesis_stateful": 1203.48, "inconsistent_api_recovery_stateful": 1666.8}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q8_0 LS/P [bare:full]", "model": "gemma-4-E4B-it-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "gemma4-e4b", "quant": "q8_0", "gen": 3, "retired": false, "score": 60.5, "accuracy": 65.7, "completeness": 92.1, "efficiency": 94.1, "wasted": 0.3, "speed": 12.6, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, 
"basic_2step": 100, "sequential_3step": 100, "conditional_routing": 60, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 84, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 24, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 30, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 12, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 120, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 210, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 120, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 90, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 117, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 235, "data_gap_recovery_extended": 0, "argument_transformation": 28, "grounded_synthesis": 134, "inconsistent_api_recovery": 457, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 5, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 104, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 30.0, "data_gap_recovery_extended": 
0.0, "argument_transformation": 0.0, "grounded_synthesis": 37.0, "inconsistent_api_recovery": 104.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 39.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 47.0, "inconsistent_api_recovery_stateful": 107.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 45.92, "argument_fidelity": 169.0, "tool_selection": 159.44, "basic_2step": 81.95, "sequential_3step": 148.2, "conditional_routing": 551.96, "sequential_reasoning": 280.59, "error_recovery": 0.0, "data_gap_recovery": 777.41, "data_gap_recovery_extended": 849.59, "argument_transformation": 1505.78, "grounded_synthesis": 1186.97, "inconsistent_api_recovery": 1750.66, "relevance_detection_stateful": 51.93, "argument_fidelity_stateful": 156.65, "tool_selection_stateful": 152.01, "basic_2step_stateful": 82.42, "sequential_3step_stateful": 146.49, 
"conditional_routing_stateful": 552.08, "sequential_reasoning_stateful": 271.92, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 813.48, "data_gap_recovery_extended_stateful": 796.67, "argument_transformation_stateful": 1562.88, "grounded_synthesis_stateful": 1268.98, "inconsistent_api_recovery_stateful": 1696.4}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [bare:full]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 61.5, "accuracy": 66.7, "completeness": 92.2, "efficiency": 91.4, "wasted": 0.4, "speed": 9.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 46, "sequential_reasoning": 94, "error_recovery": 0, "data_gap_recovery": 86, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 28, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 100, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 94, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 23, "sequential_reasoning": 47, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 14, "inconsistent_api_recovery": 44, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 11, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 92, "sequential_reasoning": 188, "error_recovery": 0, "data_gap_recovery": 215, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 140, "inconsistent_api_recovery": 352, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 235, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 101, "sequential_reasoning": 188, "error_recovery": 0, "data_gap_recovery": 238, "data_gap_recovery_extended": 0, "argument_transformation": 27, "grounded_synthesis": 189, "inconsistent_api_recovery": 458, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 54, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 257, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 205, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 9.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 28.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 110.0, "inconsistent_api_recovery": 111.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 10.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 25.0, 
"data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 128.0, "inconsistent_api_recovery_stateful": 116.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 28.53, "argument_fidelity": 82.56, "tool_selection": 110.61, "basic_2step": 48.51, "sequential_3step": 112.27, "conditional_routing": 462.14, "sequential_reasoning": 207.35, "error_recovery": 0.0, "data_gap_recovery": 569.07, "data_gap_recovery_extended": 488.26, "argument_transformation": 1250.05, "grounded_synthesis": 910.88, "inconsistent_api_recovery": 1208.33, "relevance_detection_stateful": 35.11, "argument_fidelity_stateful": 93.54, "tool_selection_stateful": 109.63, "basic_2step_stateful": 47.73, "sequential_3step_stateful": 111.57, "conditional_routing_stateful": 454.79, "sequential_reasoning_stateful": 203.4, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 564.97, "data_gap_recovery_extended_stateful": 487.96, "argument_transformation_stateful": 1250.74, "grounded_synthesis_stateful": 938.36, "inconsistent_api_recovery_stateful": 1176.86}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 
50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "gemma-4-E4B-it-Q4_K_M LS/P [bare:keep-last]", "model": "gemma-4-E4B-it-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "gemma4-e4b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 59.9, "accuracy": 65.0, "completeness": 92.2, "efficiency": 92.0, "wasted": 0.4, "speed": 8.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 44, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 90, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 22, "inconsistent_api_recovery": 92, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 12, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 22, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 11, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 
50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 88, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 225, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 110, "inconsistent_api_recovery": 368, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 96, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 252, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 157, "inconsistent_api_recovery": 467, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 30, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 251, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 88, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 8.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 31.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 126.0, "inconsistent_api_recovery": 105.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 6.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 28.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 111.0, "inconsistent_api_recovery_stateful": 113.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, 
"data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 37.41, "argument_fidelity": 90.47, "tool_selection": 107.81, "basic_2step": 51.14, "sequential_3step": 101.85, "conditional_routing": 454.66, "sequential_reasoning": 174.38, "error_recovery": 0.0, "data_gap_recovery": 532.79, "data_gap_recovery_extended": 459.64, "argument_transformation": 1123.75, "grounded_synthesis": 898.8, "inconsistent_api_recovery": 1178.1, "relevance_detection_stateful": 33.72, "argument_fidelity_stateful": 87.94, "tool_selection_stateful": 110.09, "basic_2step_stateful": 50.27, "sequential_3step_stateful": 107.5, "conditional_routing_stateful": 455.79, "sequential_reasoning_stateful": 185.08, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 527.62, "data_gap_recovery_extended_stateful": 461.84, "argument_transformation_stateful": 1194.12, "grounded_synthesis_stateful": 933.81, "inconsistent_api_recovery_stateful": 1128.95}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/P [bare:full]", "model": "Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 59.7, "accuracy": 77.0, "completeness": 77.5, "efficiency": 100.0, "wasted": 0.1, "speed": 9.8, "n": 50, "scenarios": {"relevance_detection": 30, "argument_fidelity": 100, "tool_selection": 2, "basic_2step": 98, "sequential_3step": 98, "conditional_routing": 96, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 92, "data_gap_recovery_extended": 60, "argument_transformation": 20, "grounded_synthesis": 44, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 100, "tool_selection_stateful": 4, "basic_2step_stateful": 92, "sequential_3step_stateful": 100, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 58, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 15, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 30, "argument_transformation": 10, "grounded_synthesis": 22, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 24, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 29, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}, 
"scenarioValidated": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 15, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 98, "sequential_3step": 147, "conditional_routing": 192, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 230, "data_gap_recovery_extended": 240, "argument_transformation": 50, "grounded_synthesis": 220, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 24, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 92, "sequential_3step_stateful": 150, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 232, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 15, "argument_fidelity": 151, "tool_selection": 3, "basic_2step": 99, "sequential_3step": 148, "conditional_routing": 207, "sequential_reasoning": 197, "error_recovery": 0, "data_gap_recovery": 177, "data_gap_recovery_extended": 116, "argument_transformation": 33, "grounded_synthesis": 116, 
"inconsistent_api_recovery": 196, "relevance_detection_stateful": 24, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 93, "sequential_3step_stateful": 151, "conditional_routing_stateful": 185, "sequential_reasoning_stateful": 193, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 106, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 4}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 0.0, "basic_2step": 1.0, "sequential_3step": 1.0, "conditional_routing": 31.0, "sequential_reasoning": 1.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 1.0, "conditional_routing_stateful": 26.0, "sequential_reasoning_stateful": 1.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 15.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 43.33, "argument_fidelity": 167.39, "tool_selection": 1.81, "basic_2step": 338.95, "sequential_3step": 542.59, "conditional_routing": 450.82, "sequential_reasoning": 332.46, "error_recovery": 0.0, "data_gap_recovery": 383.46, "data_gap_recovery_extended": 291.55, "argument_transformation": 1087.66, "grounded_synthesis": 594.92, "inconsistent_api_recovery": 612.19, "relevance_detection_stateful": 47.18, "argument_fidelity_stateful": 233.27, "tool_selection_stateful": 4.0, "basic_2step_stateful": 182.56, "sequential_3step_stateful": 556.96, "conditional_routing_stateful": 433.81, "sequential_reasoning_stateful": 308.72, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 383.72, "data_gap_recovery_extended_stateful": 383.58, "argument_transformation_stateful": 1171.86, "grounded_synthesis_stateful": 640.78, "inconsistent_api_recovery_stateful": 705.34}, "scenarioSpeedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 33, "argument_transformation": 49, "grounded_synthesis": 29, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 47, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 46, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 50}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [bare:full]", 
"model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 58.6, "accuracy": 65.1, "completeness": 90.1, "efficiency": 99.9, "wasted": 0.1, "speed": 10.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 98, "sequential_3step": 98, "conditional_routing": 48, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 88, "data_gap_recovery_extended": 2, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 94, "sequential_3step_stateful": 96, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 49, "conditional_routing": 24, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 1, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 48, "conditional_routing_stateful": 3, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 98, "sequential_3step": 147, "conditional_routing": 96, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 220, "data_gap_recovery_extended": 8, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 94, "sequential_3step_stateful": 144, "conditional_routing_stateful": 12, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 151, "basic_2step": 99, "sequential_3step": 147, "conditional_routing": 85, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 222, "data_gap_recovery_extended": 7, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 399, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 95, "sequential_3step_stateful": 144, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 223, "data_gap_recovery_extended_stateful": 7, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 1.0, "basic_2step": 1.0, "sequential_3step": 0.0, "conditional_routing": 2.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 7.0, "inconsistent_api_recovery": 32.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 40.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 58.29, "argument_fidelity": 124.79, "tool_selection": 117.68, "basic_2step": 258.74, "sequential_3step": 179.96, "conditional_routing": 339.3, 
"sequential_reasoning": 281.59, "error_recovery": 0.0, "data_gap_recovery": 552.02, "data_gap_recovery_extended": 780.6, "argument_transformation": 753.67, "grounded_synthesis": 617.04, "inconsistent_api_recovery": 1729.27, "relevance_detection_stateful": 68.4, "argument_fidelity_stateful": 119.64, "tool_selection_stateful": 106.44, "basic_2step_stateful": 295.71, "sequential_3step_stateful": 180.8, "conditional_routing_stateful": 334.02, "sequential_reasoning_stateful": 268.68, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 529.17, "data_gap_recovery_extended_stateful": 790.3, "argument_transformation_stateful": 790.39, "grounded_synthesis_stateful": 618.4, "inconsistent_api_recovery_stateful": 1794.3}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 48, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/P [bare:keep-last]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 59.1, "accuracy": 68.0, "completeness": 86.9, "efficiency": 97.7, "wasted": 0.2, "speed": 16.6, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 56, 
"basic_2step": 98, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 58, "data_gap_recovery_extended": 0, "argument_transformation": 22, "grounded_synthesis": 8, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 68, "basic_2step_stateful": 94, "sequential_3step_stateful": 98, "conditional_routing_stateful": 72, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 72, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 29, "data_gap_recovery_extended": 0, "argument_transformation": 11, "grounded_synthesis": 4, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 34, "basic_2step_stateful": 47, "sequential_3step_stateful": 49, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 40, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 34, "basic_2step_stateful": 47, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 40, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 34, "basic_2step_stateful": 47, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 84, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 145, "data_gap_recovery_extended": 0, "argument_transformation": 55, "grounded_synthesis": 40, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 102, "basic_2step_stateful": 94, "sequential_3step_stateful": 147, "conditional_routing_stateful": 144, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 180, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 84, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 212, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 133, "data_gap_recovery_extended": 0, "argument_transformation": 44, "grounded_synthesis": 33, "inconsistent_api_recovery": 425, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 102, "basic_2step_stateful": 94, "sequential_3step_stateful": 147, "conditional_routing_stateful": 172, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 172, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 11}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 27.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, 
"argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 58.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 65.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 40, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 34, "basic_2step_stateful": 47, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 78.74, "argument_fidelity": 190.16, "tool_selection": 131.23, "basic_2step": 158.46, "sequential_3step": 444.48, "conditional_routing": 848.35, "sequential_reasoning": 340.99, "error_recovery": 0.0, "data_gap_recovery": 654.9, "data_gap_recovery_extended": 797.43, "argument_transformation": 1584.63, "grounded_synthesis": 1652.35, "inconsistent_api_recovery": 2456.21, "relevance_detection_stateful": 81.51, "argument_fidelity_stateful": 191.91, "tool_selection_stateful": 161.37, "basic_2step_stateful": 195.81, "sequential_3step_stateful": 416.79, 
"conditional_routing_stateful": 889.08, "sequential_reasoning_stateful": 353.17, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 658.75, "data_gap_recovery_extended_stateful": 742.39, "argument_transformation_stateful": 1586.22, "grounded_synthesis_stateful": 1667.59, "inconsistent_api_recovery_stateful": 2443.1}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 40, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 34, "basic_2step_stateful": 47, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare:keep-last]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 59.2, "accuracy": 80.5, "completeness": 73.5, "efficiency": 100.0, "wasted": 0.4, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 44, "argument_fidelity": 88, "tool_selection": 36, "basic_2step": 100, "sequential_3step": 96, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 82, "data_gap_recovery_extended": 66, "argument_transformation": 4, "grounded_synthesis": 10, "inconsistent_api_recovery": 82, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 92, "tool_selection_stateful": 38, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 98, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 66, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 22, "argument_fidelity": 44, "tool_selection": 18, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 33, "argument_transformation": 2, "grounded_synthesis": 5, "inconsistent_api_recovery": 41, "relevance_detection_stateful": 17, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 33, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 2}, 
"scenarioCompleted": {"relevance_detection": 22, "argument_fidelity": 44, "tool_selection": 18, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 43, "argument_transformation": 18, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 17, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 22, "argument_fidelity": 44, "tool_selection": 18, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 43, "argument_transformation": 18, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 17, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 22, "argument_fidelity": 132, "tool_selection": 54, "basic_2step": 100, "sequential_3step": 144, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 205, "data_gap_recovery_extended": 264, "argument_transformation": 10, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 328, "relevance_detection_stateful": 17, "argument_fidelity_stateful": 138, "tool_selection_stateful": 57, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 264, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 22, "argument_fidelity": 132, "tool_selection": 54, "basic_2step": 100, "sequential_3step": 144, "conditional_routing": 231, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 176, "data_gap_recovery_extended": 212, "argument_transformation": 12, "grounded_synthesis": 45, "inconsistent_api_recovery": 387, "relevance_detection_stateful": 17, "argument_fidelity_stateful": 138, "tool_selection_stateful": 57, "basic_2step_stateful": 100, "sequential_3step_stateful": 147, "conditional_routing_stateful": 238, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 208, "data_gap_recovery_extended_stateful": 201, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 87, "inconsistent_api_recovery_stateful": 22}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 42.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 25.0, "grounded_synthesis": 11.0, "inconsistent_api_recovery": 79.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 47.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 19.0, "grounded_synthesis_stateful": 50.0, "inconsistent_api_recovery_stateful": 66.0}, "scenarioWastedN": {"relevance_detection": 22, "argument_fidelity": 44, "tool_selection": 18, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 43, "argument_transformation": 18, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 17, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 12.74, "argument_fidelity": 49.47, "tool_selection": 19.93, "basic_2step": 29.05, "sequential_3step": 62.52, "conditional_routing": 156.74, "sequential_reasoning": 77.34, "error_recovery": 0.0, "data_gap_recovery": 167.63, "data_gap_recovery_extended": 310.55, "argument_transformation": 77.6, "grounded_synthesis": 133.46, "inconsistent_api_recovery": 267.16, "relevance_detection_stateful": 9.6, "argument_fidelity_stateful": 52.43, "tool_selection_stateful": 19.9, "basic_2step_stateful": 32.57, "sequential_3step_stateful": 62.28, "conditional_routing_stateful": 169.61, "sequential_reasoning_stateful": 78.54, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 185.49, "data_gap_recovery_extended_stateful": 335.59, "argument_transformation_stateful": 99.36, "grounded_synthesis_stateful": 268.15, "inconsistent_api_recovery_stateful": 268.06}, "scenarioSpeedN": {"relevance_detection": 22, "argument_fidelity": 
44, "tool_selection": 18, "basic_2step": 50, "sequential_3step": 48, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 42, "data_gap_recovery_extended": 43, "argument_transformation": 18, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 17, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 58.6, "accuracy": 80.0, "completeness": 73.3, "efficiency": 100.0, "wasted": 0.4, "speed": 3.3, "n": 50, "scenarios": {"relevance_detection": 52, "argument_fidelity": 88, "tool_selection": 32, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 86, "data_gap_recovery_extended": 58, "argument_transformation": 4, "grounded_synthesis": 18, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 36, "argument_fidelity_stateful": 92, "tool_selection_stateful": 48, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 60, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 29, "argument_transformation": 2, "grounded_synthesis": 9, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 18, "argument_fidelity_stateful": 46, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 37, "argument_transformation": 13, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 18, 
"argument_fidelity_stateful": 46, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 23, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 37, "argument_transformation": 13, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 18, "argument_fidelity_stateful": 46, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 23, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 26, "argument_fidelity": 132, "tool_selection": 48, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 215, "data_gap_recovery_extended": 232, "argument_transformation": 10, "grounded_synthesis": 90, "inconsistent_api_recovery": 256, "relevance_detection_stateful": 18, "argument_fidelity_stateful": 138, "tool_selection_stateful": 72, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 240, 
"argument_transformation_stateful": 5, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 26, "argument_fidelity": 132, "tool_selection": 48, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 245, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 188, "data_gap_recovery_extended": 176, "argument_transformation": 9, "grounded_synthesis": 55, "inconsistent_api_recovery": 293, "relevance_detection_stateful": 18, "argument_fidelity_stateful": 138, "tool_selection_stateful": 72, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 228, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 199, "data_gap_recovery_extended_stateful": 182, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 12}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 46.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 27.0, "grounded_synthesis": 11.0, "inconsistent_api_recovery": 60.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 45.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 46.0, "grounded_synthesis_stateful": 15.0, "inconsistent_api_recovery_stateful": 79.0}, "scenarioWastedN": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 37, "argument_transformation": 13, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 18, "argument_fidelity_stateful": 46, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 23, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 15.24, "argument_fidelity": 51.95, "tool_selection": 17.82, "basic_2step": 30.08, "sequential_3step": 63.93, "conditional_routing": 170.48, "sequential_reasoning": 79.52, "error_recovery": 0.0, "data_gap_recovery": 197.33, "data_gap_recovery_extended": 312.48, "argument_transformation": 70.53, "grounded_synthesis": 256.61, "inconsistent_api_recovery": 260.23, "relevance_detection_stateful": 8.93, "argument_fidelity_stateful": 53.49, "tool_selection_stateful": 26.24, "basic_2step_stateful": 33.86, "sequential_3step_stateful": 63.66, "conditional_routing_stateful": 168.2, "sequential_reasoning_stateful": 80.21, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 196.19, "data_gap_recovery_extended_stateful": 322.12, "argument_transformation_stateful": 156.09, "grounded_synthesis_stateful": 183.54, "inconsistent_api_recovery_stateful": 290.04}, "scenarioSpeedN": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 16, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 37, "argument_transformation": 13, "grounded_synthesis": 42, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 18, "argument_fidelity_stateful": 46, 
"tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 23, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite4.1:8b-q4_K_M OL/N [reforged:full]", "model": "granite4.1:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 57.8, "accuracy": 57.8, "completeness": 100.0, "efficiency": 80.7, "wasted": 1.3, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 
50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 250, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 201, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 250, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 100.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 101.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 400.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 150.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 100.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 25.69, "argument_fidelity": 49.69, "tool_selection": 73.35, "basic_2step": 27.18, "sequential_3step": 41.51, "conditional_routing": 106.65, "sequential_reasoning": 54.63, "error_recovery": 55.75, "data_gap_recovery": 127.89, "data_gap_recovery_extended": 155.87, "argument_transformation": 188.25, "grounded_synthesis": 172.23, "inconsistent_api_recovery": 180.94, "relevance_detection_stateful": 25.68, "argument_fidelity_stateful": 49.71, "tool_selection_stateful": 73.35, "basic_2step_stateful": 23.04, "sequential_3step_stateful": 41.57, "conditional_routing_stateful": 97.19, "sequential_reasoning_stateful": 54.61, "error_recovery_stateful": 48.48, "data_gap_recovery_stateful": 126.85, "data_gap_recovery_extended_stateful": 155.86, "argument_transformation_stateful": 188.23, "grounded_synthesis_stateful": 172.18, "inconsistent_api_recovery_stateful": 180.86}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 
50}}, {"label": "Qwen3-8B-Q4_K_M LS/P [bare:full]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 57.8, "accuracy": 66.9, "completeness": 86.5, "efficiency": 97.1, "wasted": 0.2, "speed": 17.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 66, "basic_2step": 94, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 54, "data_gap_recovery_extended": 0, "argument_transformation": 24, "grounded_synthesis": 8, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 58, "basic_2step_stateful": 96, "sequential_3step_stateful": 98, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 33, "basic_2step": 47, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 27, "data_gap_recovery_extended": 0, "argument_transformation": 12, "grounded_synthesis": 4, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 29, "basic_2step_stateful": 48, "sequential_3step_stateful": 49, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 33, "basic_2step": 47, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 42, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 29, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 33, "basic_2step": 47, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 42, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 
50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 29, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 99, "basic_2step": 94, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 135, "data_gap_recovery_extended": 0, "argument_transformation": 60, "grounded_synthesis": 40, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 87, "basic_2step_stateful": 96, "sequential_3step_stateful": 147, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 110, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 99, "basic_2step": 94, "sequential_3step": 150, "conditional_routing": 218, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 123, "data_gap_recovery_extended": 0, "argument_transformation": 48, "grounded_synthesis": 30, "inconsistent_api_recovery": 452, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 87, "basic_2step_stateful": 96, "sequential_3step_stateful": 147, "conditional_routing_stateful": 165, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 99, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 29.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 6.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 69.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 29.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 65.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 33, "basic_2step": 47, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 42, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 29, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 73.69, "argument_fidelity": 196.33, "tool_selection": 158.55, "basic_2step": 165.7, "sequential_3step": 446.02, 
"conditional_routing": 877.1, "sequential_reasoning": 354.58, "error_recovery": 0.0, "data_gap_recovery": 603.12, "data_gap_recovery_extended": 837.07, "argument_transformation": 1670.83, "grounded_synthesis": 1715.35, "inconsistent_api_recovery": 2612.16, "relevance_detection_stateful": 76.88, "argument_fidelity_stateful": 194.79, "tool_selection_stateful": 151.23, "basic_2step_stateful": 221.69, "sequential_3step_stateful": 428.79, "conditional_routing_stateful": 880.55, "sequential_reasoning_stateful": 347.2, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 659.73, "data_gap_recovery_extended_stateful": 734.26, "argument_transformation_stateful": 1689.59, "grounded_synthesis_stateful": 1749.51, "inconsistent_api_recovery_stateful": 2778.66}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 33, "basic_2step": 47, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 42, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 29, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "phi-4-Q4_K_M LS/P [bare]", "model": "phi-4-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "phi-4", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 57.9, "accuracy": 68.0, "completeness": 85.2, "efficiency": 91.1, "wasted": 0.5, "speed": 3.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 
100, "basic_2step": 98, "sequential_3step": 100, "conditional_routing": 22, "sequential_reasoning": 58, "error_recovery": 0, "data_gap_recovery": 80, "data_gap_recovery_extended": 28, "argument_transformation": 20, "grounded_synthesis": 16, "inconsistent_api_recovery": 54, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 98, "sequential_3step_stateful": 100, "conditional_routing_stateful": 14, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 82, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 49, "sequential_3step": 50, "conditional_routing": 11, "sequential_reasoning": 29, "error_recovery": 0, "data_gap_recovery": 40, "data_gap_recovery_extended": 14, "argument_transformation": 10, "grounded_synthesis": 8, "inconsistent_api_recovery": 27, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 49, "sequential_3step_stateful": 50, "conditional_routing_stateful": 7, "sequential_reasoning_stateful": 34, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 32, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 45, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 32, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 45, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 32, 
"inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 98, "sequential_3step": 150, "conditional_routing": 44, "sequential_reasoning": 116, "error_recovery": 0, "data_gap_recovery": 200, "data_gap_recovery_extended": 112, "argument_transformation": 50, "grounded_synthesis": 80, "inconsistent_api_recovery": 216, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 98, "sequential_3step_stateful": 150, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 136, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 144, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 91, "sequential_3step": 150, "conditional_routing": 52, "sequential_reasoning": 116, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 88, "argument_transformation": 32, "grounded_synthesis": 144, "inconsistent_api_recovery": 266, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 95, "sequential_3step_stateful": 150, "conditional_routing_stateful": 35, "sequential_reasoning_stateful": 136, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 252, "data_gap_recovery_extended_stateful": 122, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 252, "inconsistent_api_recovery_stateful": 11}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 8.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 49.0, "data_gap_recovery_extended": 
1.0, "argument_transformation": 0.0, "grounded_synthesis": 136.0, "inconsistent_api_recovery": 55.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 7.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 50.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 176.0, "inconsistent_api_recovery_stateful": 41.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 32, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 45, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 22.68, "argument_fidelity": 54.91, "tool_selection": 45.8, "basic_2step": 34.45, "sequential_3step": 49.97, "conditional_routing": 158.61, "sequential_reasoning": 78.24, "error_recovery": 0.0, "data_gap_recovery": 166.0, "data_gap_recovery_extended": 217.37, "argument_transformation": 543.83, "grounded_synthesis": 290.67, "inconsistent_api_recovery": 232.33, "relevance_detection_stateful": 23.12, "argument_fidelity_stateful": 55.29, "tool_selection_stateful": 45.2, "basic_2step_stateful": 42.3, "sequential_3step_stateful": 50.0, 
"conditional_routing_stateful": 165.32, "sequential_reasoning_stateful": 79.23, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 181.76, "data_gap_recovery_extended_stateful": 200.26, "argument_transformation_stateful": 551.59, "grounded_synthesis_stateful": 298.34, "inconsistent_api_recovery_stateful": 207.68}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 44, "argument_transformation": 48, "grounded_synthesis": 32, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 45, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare:full]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 58.5, "accuracy": 79.5, "completeness": 73.5, "efficiency": 100.0, "wasted": 0.3, "speed": 3.5, "n": 50, "scenarios": {"relevance_detection": 26, "argument_fidelity": 98, "tool_selection": 40, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 92, "data_gap_recovery_extended": 54, "argument_transformation": 6, "grounded_synthesis": 14, "inconsistent_api_recovery": 86, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 92, "tool_selection_stateful": 38, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 13, "argument_fidelity": 49, "tool_selection": 20, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 46, "data_gap_recovery_extended": 27, "argument_transformation": 3, "grounded_synthesis": 7, "inconsistent_api_recovery": 43, "relevance_detection_stateful": 22, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 2}, 
"scenarioCompleted": {"relevance_detection": 13, "argument_fidelity": 49, "tool_selection": 20, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 42, "argument_transformation": 14, "grounded_synthesis": 40, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 22, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 42, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 13, "argument_fidelity": 49, "tool_selection": 20, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 42, "argument_transformation": 14, "grounded_synthesis": 40, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 22, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 42, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 13, "argument_fidelity": 147, "tool_selection": 60, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 230, "data_gap_recovery_extended": 216, "argument_transformation": 15, "grounded_synthesis": 70, 
"inconsistent_api_recovery": 344, "relevance_detection_stateful": 22, "argument_fidelity_stateful": 138, "tool_selection_stateful": 57, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 176, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 13, "argument_fidelity": 147, "tool_selection": 60, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 237, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 204, "data_gap_recovery_extended": 168, "argument_transformation": 21, "grounded_synthesis": 35, "inconsistent_api_recovery": 398, "relevance_detection_stateful": 22, "argument_fidelity_stateful": 138, "tool_selection_stateful": 57, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 227, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 198, "data_gap_recovery_extended_stateful": 140, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 23}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 45.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 28.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 75.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 44.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 22.0, "grounded_synthesis_stateful": 11.0, "inconsistent_api_recovery_stateful": 61.0}, "scenarioWastedN": {"relevance_detection": 13, "argument_fidelity": 49, "tool_selection": 20, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 42, "argument_transformation": 14, "grounded_synthesis": 40, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 22, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 42, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 7.09, "argument_fidelity": 57.92, "tool_selection": 21.86, "basic_2step": 30.2, "sequential_3step": 66.23, "conditional_routing": 176.89, "sequential_reasoning": 78.97, "error_recovery": 0.0, "data_gap_recovery": 191.79, "data_gap_recovery_extended": 334.94, "argument_transformation": 85.65, "grounded_synthesis": 330.96, "inconsistent_api_recovery": 273.88, "relevance_detection_stateful": 12.31, "argument_fidelity_stateful": 54.03, "tool_selection_stateful": 21.54, "basic_2step_stateful": 33.74, "sequential_3step_stateful": 62.48, "conditional_routing_stateful": 172.39, "sequential_reasoning_stateful": 79.72, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 186.63, "data_gap_recovery_extended_stateful": 345.77, "argument_transformation_stateful": 107.44, "grounded_synthesis_stateful": 284.75, "inconsistent_api_recovery_stateful": 283.57}, "scenarioSpeedN": {"relevance_detection": 13, "argument_fidelity": 
49, "tool_selection": 20, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 42, "argument_transformation": 14, "grounded_synthesis": 40, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 22, "argument_fidelity_stateful": 46, "tool_selection_stateful": 19, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 42, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3-8B-Q4_K_M LS/P [bare]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 57.4, "accuracy": 66.6, "completeness": 86.2, "efficiency": 96.8, "wasted": 0.2, "speed": 17.2, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, "tool_selection": 56, "basic_2step": 96, "sequential_3step": 100, "conditional_routing": 90, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 62, "data_gap_recovery_extended": 0, "argument_transformation": 22, "grounded_synthesis": 6, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 48, "basic_2step_stateful": 94, "sequential_3step_stateful": 100, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 64, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 48, "sequential_3step": 50, "conditional_routing": 45, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 31, "data_gap_recovery_extended": 0, "argument_transformation": 11, "grounded_synthesis": 3, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 24, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 32, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 48, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 37, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 24, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 48, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 37, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 24, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 84, "basic_2step": 96, "sequential_3step": 150, "conditional_routing": 180, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 155, "data_gap_recovery_extended": 0, "argument_transformation": 55, "grounded_synthesis": 30, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 72, "basic_2step_stateful": 94, "sequential_3step_stateful": 150, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 160, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 
20, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 84, "basic_2step": 96, "sequential_3step": 150, "conditional_routing": 209, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 144, "data_gap_recovery_extended": 0, "argument_transformation": 45, "grounded_synthesis": 24, "inconsistent_api_recovery": 447, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 72, "basic_2step_stateful": 94, "sequential_3step_stateful": 150, "conditional_routing_stateful": 132, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 149, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 17, "inconsistent_api_recovery_stateful": 12}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 29.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 75.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 20.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 4.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 70.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 48, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, 
"data_gap_recovery_extended": 37, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 24, "basic_2step_stateful": 47, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 76.7, "argument_fidelity": 203.89, "tool_selection": 136.34, "basic_2step": 177.25, "sequential_3step": 436.96, "conditional_routing": 887.79, "sequential_reasoning": 356.92, "error_recovery": 0.0, "data_gap_recovery": 673.36, "data_gap_recovery_extended": 732.56, "argument_transformation": 1769.26, "grounded_synthesis": 1588.84, "inconsistent_api_recovery": 2691.73, "relevance_detection_stateful": 84.67, "argument_fidelity_stateful": 192.53, "tool_selection_stateful": 117.19, "basic_2step_stateful": 221.67, "sequential_3step_stateful": 431.15, "conditional_routing_stateful": 852.38, "sequential_reasoning_stateful": 346.29, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 687.42, "data_gap_recovery_extended_stateful": 752.09, "argument_transformation_stateful": 1653.42, "grounded_synthesis_stateful": 1651.65, "inconsistent_api_recovery_stateful": 2586.35}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 28, "basic_2step": 48, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 37, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 24, "basic_2step_stateful": 47, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 55.2, "accuracy": 79.8, "completeness": 69.2, "efficiency": 100.0, "wasted": 0.4, "speed": 2.3, "n": 50, "scenarios": {"relevance_detection": 58, "argument_fidelity": 70, "tool_selection": 14, "basic_2step": 100, "sequential_3step": 92, "conditional_routing": 96, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 72, "data_gap_recovery_extended": 48, "argument_transformation": 4, "grounded_synthesis": 16, "inconsistent_api_recovery": 66, "relevance_detection_stateful": 58, "argument_fidelity_stateful": 84, "tool_selection_stateful": 10, "basic_2step_stateful": 100, "sequential_3step_stateful": 92, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 88, "data_gap_recovery_extended_stateful": 60, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 29, "argument_fidelity": 35, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 24, "argument_transformation": 2, "grounded_synthesis": 8, "inconsistent_api_recovery": 33, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 42, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 44, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 29, "argument_fidelity": 35, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 32, "argument_transformation": 18, "grounded_synthesis": 32, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 42, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 49}, 
"scenarioValidated": {"relevance_detection": 29, "argument_fidelity": 35, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 32, "argument_transformation": 18, "grounded_synthesis": 32, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 42, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 29, "argument_fidelity": 105, "tool_selection": 21, "basic_2step": 100, "sequential_3step": 138, "conditional_routing": 192, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 180, "data_gap_recovery_extended": 192, "argument_transformation": 10, "grounded_synthesis": 80, "inconsistent_api_recovery": 264, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 126, "tool_selection_stateful": 15, "basic_2step_stateful": 100, "sequential_3step_stateful": 138, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 220, "data_gap_recovery_extended_stateful": 240, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 29, "argument_fidelity": 105, "tool_selection": 21, "basic_2step": 100, "sequential_3step": 138, "conditional_routing": 229, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 165, "data_gap_recovery_extended": 156, "argument_transformation": 8, "grounded_synthesis": 71, 
"inconsistent_api_recovery": 313, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 126, "tool_selection_stateful": 15, "basic_2step_stateful": 100, "sequential_3step_stateful": 138, "conditional_routing_stateful": 219, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 187, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 54, "inconsistent_api_recovery_stateful": 12}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 41.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 25.0, "grounded_synthesis": 27.0, "inconsistent_api_recovery": 81.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 11.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 28.0, "grounded_synthesis_stateful": 49.0, "inconsistent_api_recovery_stateful": 71.0}, "scenarioWastedN": {"relevance_detection": 29, "argument_fidelity": 35, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 32, "argument_transformation": 18, "grounded_synthesis": 32, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 42, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 10.47, "argument_fidelity": 26.45, "tool_selection": 5.12, "basic_2step": 19.04, "sequential_3step": 36.77, "conditional_routing": 113.82, "sequential_reasoning": 49.23, "error_recovery": 0.0, "data_gap_recovery": 103.55, "data_gap_recovery_extended": 160.01, "argument_transformation": 94.11, "grounded_synthesis": 201.08, "inconsistent_api_recovery": 180.92, "relevance_detection_stateful": 9.86, "argument_fidelity_stateful": 33.05, "tool_selection_stateful": 3.5, "basic_2step_stateful": 21.18, "sequential_3step_stateful": 36.79, "conditional_routing_stateful": 109.99, "sequential_reasoning_stateful": 50.91, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 128.33, "data_gap_recovery_extended_stateful": 182.5, "argument_transformation_stateful": 108.39, "grounded_synthesis_stateful": 215.78, "inconsistent_api_recovery_stateful": 190.04}, "scenarioSpeedN": {"relevance_detection": 29, "argument_fidelity": 35, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 32, "argument_transformation": 18, "grounded_synthesis": 32, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 42, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 49}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare:full]", "model": 
"Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 54.6, "accuracy": 78.7, "completeness": 69.4, "efficiency": 100.0, "wasted": 0.4, "speed": 2.4, "n": 50, "scenarios": {"relevance_detection": 58, "argument_fidelity": 74, "tool_selection": 4, "basic_2step": 100, "sequential_3step": 86, "conditional_routing": 94, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 76, "data_gap_recovery_extended": 54, "argument_transformation": 2, "grounded_synthesis": 12, "inconsistent_api_recovery": 62, "relevance_detection_stateful": 66, "argument_fidelity_stateful": 78, "tool_selection_stateful": 8, "basic_2step_stateful": 100, "sequential_3step_stateful": 94, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 86, "data_gap_recovery_extended_stateful": 54, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 29, 
"argument_fidelity": 37, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 27, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 31, "relevance_detection_stateful": 33, "argument_fidelity_stateful": 39, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 27, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 29, "argument_fidelity": 37, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 45, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 34, "argument_transformation": 20, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 33, "argument_fidelity_stateful": 39, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 29, "argument_fidelity": 37, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 45, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 34, "argument_transformation": 20, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 33, 
"argument_fidelity_stateful": 39, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 29, "argument_fidelity": 111, "tool_selection": 6, "basic_2step": 100, "sequential_3step": 129, "conditional_routing": 188, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 190, "data_gap_recovery_extended": 216, "argument_transformation": 5, "grounded_synthesis": 60, "inconsistent_api_recovery": 248, "relevance_detection_stateful": 33, "argument_fidelity_stateful": 117, "tool_selection_stateful": 12, "basic_2step_stateful": 100, "sequential_3step_stateful": 141, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 216, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 90, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 29, "argument_fidelity": 111, "tool_selection": 6, "basic_2step": 100, "sequential_3step": 129, "conditional_routing": 228, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 177, "data_gap_recovery_extended": 168, "argument_transformation": 4, "grounded_synthesis": 41, "inconsistent_api_recovery": 288, "relevance_detection_stateful": 33, "argument_fidelity_stateful": 117, "tool_selection_stateful": 12, "basic_2step_stateful": 100, "sequential_3step_stateful": 141, "conditional_routing_stateful": 224, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 191, "data_gap_recovery_extended_stateful": 169, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 77, "inconsistent_api_recovery_stateful": 13}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 43.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 6.0, "data_gap_recovery_extended": 3.0, "argument_transformation": 43.0, "grounded_synthesis": 18.0, "inconsistent_api_recovery": 74.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 40.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 29.0, "grounded_synthesis_stateful": 54.0, "inconsistent_api_recovery_stateful": 82.0}, "scenarioWastedN": {"relevance_detection": 29, "argument_fidelity": 37, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 45, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 34, "argument_transformation": 20, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 33, "argument_fidelity_stateful": 39, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 9.81, "argument_fidelity": 29.63, "tool_selection": 1.26, "basic_2step": 19.04, "sequential_3step": 36.5, "conditional_routing": 102.51, 
"sequential_reasoning": 49.46, "error_recovery": 0.0, "data_gap_recovery": 110.51, "data_gap_recovery_extended": 177.37, "argument_transformation": 114.85, "grounded_synthesis": 224.0, "inconsistent_api_recovery": 188.64, "relevance_detection_stateful": 10.81, "argument_fidelity_stateful": 30.34, "tool_selection_stateful": 3.49, "basic_2step_stateful": 21.08, "sequential_3step_stateful": 40.2, "conditional_routing_stateful": 113.52, "sequential_reasoning_stateful": 49.8, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 132.74, "data_gap_recovery_extended_stateful": 178.54, "argument_transformation_stateful": 123.7, "grounded_synthesis_stateful": 223.36, "inconsistent_api_recovery_stateful": 180.14}, "scenarioSpeedN": {"relevance_detection": 29, "argument_fidelity": 37, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 45, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 34, "argument_transformation": 20, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 33, "argument_fidelity_stateful": 39, "tool_selection_stateful": 4, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 39, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/P [reforged:full]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 54.2, "accuracy": 59.5, "completeness": 91.1, "efficiency": 74.7, "wasted": 2.0, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 92, 
"tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 84, "sequential_reasoning": 0, "error_recovery": 24, "data_gap_recovery": 38, "data_gap_recovery_extended": 20, "argument_transformation": 0, "grounded_synthesis": 34, "inconsistent_api_recovery": 62, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 88, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 32, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 48}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 46, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 42, "sequential_reasoning": 0, "error_recovery": 12, "data_gap_recovery": 19, "data_gap_recovery_extended": 10, "argument_transformation": 0, "grounded_synthesis": 17, "inconsistent_api_recovery": 31, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 44, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 16, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 24}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 11, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 11, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, 
"grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 138, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 168, "sequential_reasoning": 0, "error_recovery": 24, "data_gap_recovery": 95, "data_gap_recovery_extended": 80, "argument_transformation": 0, "grounded_synthesis": 170, "inconsistent_api_recovery": 248, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 132, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 48, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 24, "data_gap_recovery_stateful": 75, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 192}, "scenarioActualCalls": {"relevance_detection": 56, "argument_fidelity": 141, "tool_selection": 124, "basic_2step": 106, "sequential_3step": 206, "conditional_routing": 251, "sequential_reasoning": 0, "error_recovery": 75, "data_gap_recovery": 80, "data_gap_recovery_extended": 64, "argument_transformation": 0, "grounded_synthesis": 291, "inconsistent_api_recovery": 429, "relevance_detection_stateful": 55, "argument_fidelity_stateful": 134, "tool_selection_stateful": 161, "basic_2step_stateful": 112, "sequential_3step_stateful": 63, "conditional_routing_stateful": 280, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 60, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 55, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 261, "inconsistent_api_recovery_stateful": 344}, "scenarioWastedSum": {"relevance_detection": 7.0, "argument_fidelity": 3.0, "tool_selection": 11.0, "basic_2step": 6.0, "sequential_3step": 56.0, "conditional_routing": 100.0, "sequential_reasoning": 36.0, "error_recovery": 306.0, 
"data_gap_recovery": 25.0, "data_gap_recovery_extended": 17.0, "argument_transformation": 45.0, "grounded_synthesis": 339.0, "inconsistent_api_recovery": 308.0, "relevance_detection_stateful": 5.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 14.0, "basic_2step_stateful": 12.0, "sequential_3step_stateful": 26.0, "conditional_routing_stateful": 92.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 218.0, "data_gap_recovery_stateful": 21.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 48.0, "grounded_synthesis_stateful": 338.0, "inconsistent_api_recovery_stateful": 300.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 11, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 16.34, "argument_fidelity": 48.98, "tool_selection": 52.51, "basic_2step": 30.74, "sequential_3step": 77.1, "conditional_routing": 139.17, "sequential_reasoning": 73.98, "error_recovery": 89.76, "data_gap_recovery": 158.52, "data_gap_recovery_extended": 189.64, "argument_transformation": 156.79, "grounded_synthesis": 464.56, "inconsistent_api_recovery": 371.67, "relevance_detection_stateful": 15.59, "argument_fidelity_stateful": 49.85, "tool_selection_stateful": 63.87, 
"basic_2step_stateful": 36.81, "sequential_3step_stateful": 26.37, "conditional_routing_stateful": 131.04, "sequential_reasoning_stateful": 79.23, "error_recovery_stateful": 83.12, "data_gap_recovery_stateful": 148.06, "data_gap_recovery_extended_stateful": 167.06, "argument_transformation_stateful": 139.25, "grounded_synthesis_stateful": 463.7, "inconsistent_api_recovery_stateful": 373.2}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 49, "argument_transformation": 11, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 17, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407.Q4_K_M LF/P [bare:full]", "model": "Mistral-Nemo-Instruct-2407.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 53.5, "accuracy": 60.6, "completeness": 88.3, "efficiency": 100.0, "wasted": 0.3, "speed": 3.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 96, "sequential_reasoning": 92, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 56, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, 
"argument_fidelity_stateful": 100, "tool_selection_stateful": 54, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 86, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 28, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 27, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 184, "error_recovery": 0, "data_gap_recovery": 35, 
"data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 280, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 220, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 182, "sequential_reasoning": 179, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 168, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 86, "sequential_reasoning_stateful": 170, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 132, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 2.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 165.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 4.0, 
"sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 13.0, "data_gap_recovery_extended_stateful": 4.0, "argument_transformation_stateful": 172.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 11.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 37.18, "argument_fidelity": 99.63, "tool_selection": 88.41, "basic_2step": 64.63, "sequential_3step": 96.22, "conditional_routing": 205.88, "sequential_reasoning": 145.82, "error_recovery": 0.0, "data_gap_recovery": 168.02, "data_gap_recovery_extended": 258.73, "argument_transformation": 365.86, "grounded_synthesis": 434.01, "inconsistent_api_recovery": 237.27, "relevance_detection_stateful": 37.86, "argument_fidelity_stateful": 99.84, "tool_selection_stateful": 82.77, "basic_2step_stateful": 68.97, "sequential_3step_stateful": 95.46, "conditional_routing_stateful": 221.47, "sequential_reasoning_stateful": 142.03, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 170.87, "data_gap_recovery_extended_stateful": 237.37, "argument_transformation_stateful": 384.68, "grounded_synthesis_stateful": 310.99, "inconsistent_api_recovery_stateful": 
241.59}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 46, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/P [bare]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 54.1, "accuracy": 63.2, "completeness": 85.6, "efficiency": 94.8, "wasted": 0.2, "speed": 22.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 14, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 92, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 64, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 34, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 20, "basic_2step_stateful": 88, "sequential_3step_stateful": 98, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 68, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}, 
"scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 46, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 17, "inconsistent_api_recovery": 23, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 10, "basic_2step_stateful": 44, "sequential_3step_stateful": 49, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 8}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 10, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 10, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 21, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 184, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 160, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 170, "inconsistent_api_recovery": 184, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 30, "basic_2step_stateful": 88, "sequential_3step_stateful": 147, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 170, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 64}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 21, "basic_2step": 88, "sequential_3step": 150, "conditional_routing": 205, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 151, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 222, "inconsistent_api_recovery": 199, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 30, "basic_2step_stateful": 88, "sequential_3step_stateful": 147, "conditional_routing_stateful": 124, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 158, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 109, "inconsistent_api_recovery_stateful": 90}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 24.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 76.0, "inconsistent_api_recovery": 27.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 24.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 56.0, "inconsistent_api_recovery_stateful": 42.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 
50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 10, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 78.65, "argument_fidelity": 316.35, "tool_selection": 43.28, "basic_2step": 126.59, "sequential_3step": 468.07, "conditional_routing": 1083.41, "sequential_reasoning": 464.4, "error_recovery": 0.0, "data_gap_recovery": 858.62, "data_gap_recovery_extended": 1199.97, "argument_transformation": 2508.16, "grounded_synthesis": 3375.63, "inconsistent_api_recovery": 1836.06, "relevance_detection_stateful": 77.85, "argument_fidelity_stateful": 312.29, "tool_selection_stateful": 67.08, "basic_2step_stateful": 146.93, "sequential_3step_stateful": 486.3, "conditional_routing_stateful": 1044.57, "sequential_reasoning_stateful": 476.62, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 766.42, "data_gap_recovery_extended_stateful": 1235.44, "argument_transformation_stateful": 2470.35, "grounded_synthesis_stateful": 3305.05, "inconsistent_api_recovery_stateful": 2039.49}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 7, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 10, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 49}}, {"label": "Qwen3-14B-Q4_K_M LS/P [bare:full]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 53.9, "accuracy": 62.9, "completeness": 85.8, "efficiency": 94.4, "wasted": 0.2, "speed": 24.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 22, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 90, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 66, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 30, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 14, "basic_2step_stateful": 76, "sequential_3step_stateful": 100, "conditional_routing_stateful": 30, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 12}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 11, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 45, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 33, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 15, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 7, "basic_2step_stateful": 38, "sequential_3step_stateful": 50, "conditional_routing_stateful": 15, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 18, "inconsistent_api_recovery_stateful": 6}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 11, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 7, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 11, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 7, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 33, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 180, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 165, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 150, "inconsistent_api_recovery": 224, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 21, "basic_2step_stateful": 76, "sequential_3step_stateful": 150, "conditional_routing_stateful": 60, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 175, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 180, "inconsistent_api_recovery_stateful": 48}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 33, "basic_2step": 92, "sequential_3step": 150, "conditional_routing": 190, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 155, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 191, "inconsistent_api_recovery": 248, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 21, "basic_2step_stateful": 76, "sequential_3step_stateful": 150, "conditional_routing_stateful": 74, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 168, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 245, "inconsistent_api_recovery_stateful": 67}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 15.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 65.0, "inconsistent_api_recovery": 37.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 14.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 105.0, "inconsistent_api_recovery_stateful": 29.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 11, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 7, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 81.99, "argument_fidelity": 317.24, "tool_selection": 74.77, "basic_2step": 147.22, "sequential_3step": 489.22, "conditional_routing": 1015.99, "sequential_reasoning": 494.71, "error_recovery": 0.0, "data_gap_recovery": 876.55, "data_gap_recovery_extended": 1257.75, "argument_transformation": 2795.28, "grounded_synthesis": 3821.35, "inconsistent_api_recovery": 2166.74, "relevance_detection_stateful": 83.09, "argument_fidelity_stateful": 332.78, "tool_selection_stateful": 47.66, "basic_2step_stateful": 140.19, "sequential_3step_stateful": 525.35, "conditional_routing_stateful": 996.91, "sequential_reasoning_stateful": 494.12, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 901.68, "data_gap_recovery_extended_stateful": 1330.11, "argument_transformation_stateful": 2678.05, "grounded_synthesis_stateful": 4317.59, "inconsistent_api_recovery_stateful": 1788.75}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 11, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 7, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/P [bare:keep-last]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 53.8, "accuracy": 63.2, "completeness": 85.2, "efficiency": 95.4, "wasted": 0.2, "speed": 23.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 20, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 30, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 18, "basic_2step_stateful": 88, "sequential_3step_stateful": 98, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 62, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 10}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 10, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 30, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 15, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 9, "basic_2step_stateful": 44, "sequential_3step_stateful": 49, "conditional_routing_stateful": 19, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 31, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 5}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 10, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 9, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 10, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, 
"grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 9, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 30, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 150, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 150, "inconsistent_api_recovery": 224, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 27, "basic_2step_stateful": 88, "sequential_3step_stateful": 147, "conditional_routing_stateful": 76, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 155, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 120, "inconsistent_api_recovery_stateful": 40}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 30, "basic_2step": 88, "sequential_3step": 150, "conditional_routing": 198, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 145, "data_gap_recovery_extended": 0, "argument_transformation": 3, "grounded_synthesis": 190, "inconsistent_api_recovery": 241, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 27, "basic_2step_stateful": 88, "sequential_3step_stateful": 147, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 146, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 165, "inconsistent_api_recovery_stateful": 56}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 16.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 63.0, "inconsistent_api_recovery": 32.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 18.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 82.0, "inconsistent_api_recovery_stateful": 22.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 10, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 9, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 80.25, "argument_fidelity": 319.47, "tool_selection": 68.86, "basic_2step": 
125.95, "sequential_3step": 483.09, "conditional_routing": 1015.72, "sequential_reasoning": 477.54, "error_recovery": 0.0, "data_gap_recovery": 825.28, "data_gap_recovery_extended": 1308.13, "argument_transformation": 2387.35, "grounded_synthesis": 3729.41, "inconsistent_api_recovery": 2147.69, "relevance_detection_stateful": 79.14, "argument_fidelity_stateful": 313.12, "tool_selection_stateful": 64.67, "basic_2step_stateful": 147.52, "sequential_3step_stateful": 492.22, "conditional_routing_stateful": 1015.29, "sequential_reasoning_stateful": 480.99, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 847.77, "data_gap_recovery_extended_stateful": 1234.41, "argument_transformation_stateful": 2418.35, "grounded_synthesis_stateful": 3889.51, "inconsistent_api_recovery_stateful": 1607.81}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 10, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 9, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}}, {"label": "granite-4.1-8b-Q4_K_M LS/N [bare]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 53.8, "accuracy": 70.0, "completeness": 76.9, "efficiency": 96.0, "wasted": 0.2, "speed": 1.9, "n": 50, "scenarios": 
{"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 
0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 40.63, "tool_selection": 33.24, "basic_2step": 22.09, "sequential_3step": 34.09, "conditional_routing": 262.87, "sequential_reasoning": 52.15, "error_recovery": 0.0, "data_gap_recovery": 128.65, "data_gap_recovery_extended": 166.32, "argument_transformation": 0.0, "grounded_synthesis": 158.55, "inconsistent_api_recovery": 127.17, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 41.75, "tool_selection_stateful": 33.84, "basic_2step_stateful": 24.55, 
"sequential_3step_stateful": 34.04, "conditional_routing_stateful": 93.19, "sequential_reasoning_stateful": 57.71, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 132.78, "data_gap_recovery_extended_stateful": 170.36, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 144.3, "inconsistent_api_recovery_stateful": 134.56}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q4_K_M LS/N [bare:keep-last]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 53.8, "accuracy": 70.0, "completeness": 76.9, "efficiency": 96.0, "wasted": 0.2, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 
100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, 
"scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 41.63, "tool_selection": 34.15, "basic_2step": 22.6, "sequential_3step": 34.62, "conditional_routing": 268.34, "sequential_reasoning": 53.23, "error_recovery": 0.0, "data_gap_recovery": 131.21, "data_gap_recovery_extended": 169.86, "argument_transformation": 0.0, "grounded_synthesis": 161.94, "inconsistent_api_recovery": 129.79, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 42.76, "tool_selection_stateful": 34.36, "basic_2step_stateful": 25.06, "sequential_3step_stateful": 34.57, "conditional_routing_stateful": 95.38, "sequential_reasoning_stateful": 59.19, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 135.37, "data_gap_recovery_extended_stateful": 173.84, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 147.28, "inconsistent_api_recovery_stateful": 137.38}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q4_K_M LS/N [bare:full]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 53.8, "accuracy": 70.0, "completeness": 76.9, "efficiency": 96.0, "wasted": 0.2, "speed": 2.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, 
"inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 42.63, "tool_selection": 35.16, "basic_2step": 23.11, "sequential_3step": 35.6, "conditional_routing": 275.85, "sequential_reasoning": 54.65, "error_recovery": 0.0, "data_gap_recovery": 135.09, "data_gap_recovery_extended": 174.49, "argument_transformation": 0.0, "grounded_synthesis": 166.4, "inconsistent_api_recovery": 133.17, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 43.77, "tool_selection_stateful": 35.35, "basic_2step_stateful": 26.06, "sequential_3step_stateful": 35.54, "conditional_routing_stateful": 97.95, "sequential_reasoning_stateful": 60.71, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 139.1, "data_gap_recovery_extended_stateful": 178.45, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 151.42, "inconsistent_api_recovery_stateful": 141.08}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare:keep-last]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "keep-last", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 54.3, "accuracy": 78.8, "completeness": 68.9, "efficiency": 100.0, "wasted": 0.3, "speed": 2.4, "n": 50, "scenarios": {"relevance_detection": 50, "argument_fidelity": 78, "tool_selection": 4, "basic_2step": 100, "sequential_3step": 92, "conditional_routing": 98, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 68, "data_gap_recovery_extended": 50, "argument_transformation": 10, "grounded_synthesis": 12, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 64, "argument_fidelity_stateful": 72, "tool_selection_stateful": 10, "basic_2step_stateful": 100, "sequential_3step_stateful": 92, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 62, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 25, "argument_fidelity": 39, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 25, "argument_transformation": 5, "grounded_synthesis": 6, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 36, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 31, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 25, "argument_fidelity": 39, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 39, "argument_transformation": 21, "grounded_synthesis": 31, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 36, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 25, 
"argument_fidelity": 39, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 39, "argument_transformation": 21, "grounded_synthesis": 31, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 36, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 25, "argument_fidelity": 117, "tool_selection": 6, "basic_2step": 100, "sequential_3step": 138, "conditional_routing": 196, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 170, "data_gap_recovery_extended": 200, "argument_transformation": 25, "grounded_synthesis": 60, "inconsistent_api_recovery": 256, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 108, "tool_selection_stateful": 15, "basic_2step_stateful": 100, "sequential_3step_stateful": 138, "conditional_routing_stateful": 184, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 248, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 25, "argument_fidelity": 117, "tool_selection": 6, "basic_2step": 100, "sequential_3step": 138, "conditional_routing": 234, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 154, "data_gap_recovery_extended": 147, "argument_transformation": 25, "grounded_synthesis": 38, "inconsistent_api_recovery": 312, 
"relevance_detection_stateful": 32, "argument_fidelity_stateful": 108, "tool_selection_stateful": 15, "basic_2step_stateful": 100, "sequential_3step_stateful": 138, "conditional_routing_stateful": 213, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 183, "data_gap_recovery_extended_stateful": 197, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 36, "inconsistent_api_recovery_stateful": 12}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 43.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 20.0, "grounded_synthesis": 20.0, "inconsistent_api_recovery": 77.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 4.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 3.0, "inconsistent_api_recovery_stateful": 86.0}, "scenarioWastedN": {"relevance_detection": 25, "argument_fidelity": 39, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 39, "argument_transformation": 21, "grounded_synthesis": 31, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 36, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, 
"data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 9.06, "argument_fidelity": 31.6, "tool_selection": 1.33, "basic_2step": 19.1, "sequential_3step": 39.26, "conditional_routing": 110.62, "sequential_reasoning": 50.56, "error_recovery": 0.0, "data_gap_recovery": 95.31, "data_gap_recovery_extended": 200.59, "argument_transformation": 111.89, "grounded_synthesis": 185.44, "inconsistent_api_recovery": 180.4, "relevance_detection_stateful": 11.58, "argument_fidelity_stateful": 28.75, "tool_selection_stateful": 3.79, "basic_2step_stateful": 21.55, "sequential_3step_stateful": 36.87, "conditional_routing_stateful": 109.96, "sequential_reasoning_stateful": 49.63, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 119.93, "data_gap_recovery_extended_stateful": 191.37, "argument_transformation_stateful": 90.83, "grounded_synthesis_stateful": 239.44, "inconsistent_api_recovery_stateful": 184.56}, "scenarioSpeedN": {"relevance_detection": 25, "argument_fidelity": 39, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 39, "argument_transformation": 21, "grounded_synthesis": 31, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 36, "tool_selection_stateful": 5, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q8_0 LF/P [reforged:full]", "model": "Meta-Llama-3.1-8B-Instruct.Q8_0", 
"backend": "llamafile", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 53.3, "accuracy": 58.9, "completeness": 90.5, "efficiency": 73.0, "wasted": 1.8, "speed": 5.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 92, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 10, "error_recovery": 32, "data_gap_recovery": 28, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 48, "inconsistent_api_recovery": 52, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 92, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 22, "data_gap_recovery_stateful": 32, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 42}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 46, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 5, "error_recovery": 16, "data_gap_recovery": 14, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 24, "inconsistent_api_recovery": 26, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 46, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 21}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 138, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 20, "error_recovery": 32, "data_gap_recovery": 70, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 240, "inconsistent_api_recovery": 208, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 138, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 33, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 200, "inconsistent_api_recovery_stateful": 168}, "scenarioActualCalls": {"relevance_detection": 52, "argument_fidelity": 148, "tool_selection": 191, "basic_2step": 100, "sequential_3step": 205, "conditional_routing": 231, "sequential_reasoning": 16, "error_recovery": 76, "data_gap_recovery": 64, "data_gap_recovery_extended": 34, "argument_transformation": 0, "grounded_synthesis": 393, "inconsistent_api_recovery": 366, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 148, "tool_selection_stateful": 191, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 245, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 55, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 343, "inconsistent_api_recovery_stateful": 302}, "scenarioWastedSum": {"relevance_detection": 2.0, "argument_fidelity": 10.0, "tool_selection": 43.0, "basic_2step": 0.0, "sequential_3step": 55.0, "conditional_routing": 74.0, "sequential_reasoning": 26.0, "error_recovery": 207.0, "data_gap_recovery": 30.0, "data_gap_recovery_extended": 40.0, "argument_transformation": 81.0, "grounded_synthesis": 292.0, "inconsistent_api_recovery": 291.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 11.0, "tool_selection_stateful": 41.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 10.0, "conditional_routing_stateful": 75.0, "sequential_reasoning_stateful": 37.0, "error_recovery_stateful": 113.0, "data_gap_recovery_stateful": 23.0, "data_gap_recovery_extended_stateful": 28.0, "argument_transformation_stateful": 66.0, "grounded_synthesis_stateful": 289.0, "inconsistent_api_recovery_stateful": 300.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 26.34, "argument_fidelity": 99.29, "tool_selection": 119.4, "basic_2step": 50.54, "sequential_3step": 135.04, "conditional_routing": 199.34, "sequential_reasoning": 128.63, 
"error_recovery": 133.24, "data_gap_recovery": 229.53, "data_gap_recovery_extended": 349.93, "argument_transformation": 331.24, "grounded_synthesis": 763.37, "inconsistent_api_recovery": 651.1, "relevance_detection_stateful": 25.27, "argument_fidelity_stateful": 98.38, "tool_selection_stateful": 121.62, "basic_2step_stateful": 55.07, "sequential_3step_stateful": 9.64, "conditional_routing_stateful": 194.99, "sequential_reasoning_stateful": 143.1, "error_recovery_stateful": 118.16, "data_gap_recovery_stateful": 215.47, "data_gap_recovery_extended_stateful": 316.85, "argument_transformation_stateful": 192.9, "grounded_synthesis_stateful": 779.1, "inconsistent_api_recovery_stateful": 641.26}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 18, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 1, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 48, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/P [reforged:full]", "model": "Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 52.8, "accuracy": 59.5, "completeness": 88.6, "efficiency": 75.2, "wasted": 1.9, "speed": 1.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 96, "tool_selection": 100, 
"basic_2step": 100, "sequential_3step": 100, "conditional_routing": 90, "sequential_reasoning": 10, "error_recovery": 28, "data_gap_recovery": 36, "data_gap_recovery_extended": 22, "argument_transformation": 0, "grounded_synthesis": 30, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 96, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 4, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 30, "data_gap_recovery_stateful": 24, "data_gap_recovery_extended_stateful": 22, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 30}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 48, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 45, "sequential_reasoning": 5, "error_recovery": 14, "data_gap_recovery": 18, "data_gap_recovery_extended": 11, "argument_transformation": 0, "grounded_synthesis": 15, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 48, 
"tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 2, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 15, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 15}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 144, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 180, "sequential_reasoning": 20, "error_recovery": 28, "data_gap_recovery": 90, "data_gap_recovery_extended": 88, "argument_transformation": 0, "grounded_synthesis": 150, "inconsistent_api_recovery": 192, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 144, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 6, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 45, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 88, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 120}, "scenarioActualCalls": {"relevance_detection": 66, "argument_fidelity": 149, "tool_selection": 130, "basic_2step": 100, "sequential_3step": 199, "conditional_routing": 257, "sequential_reasoning": 17, "error_recovery": 112, "data_gap_recovery": 84, "data_gap_recovery_extended": 67, "argument_transformation": 0, "grounded_synthesis": 268, "inconsistent_api_recovery": 349, "relevance_detection_stateful": 67, "argument_fidelity_stateful": 151, "tool_selection_stateful": 164, "basic_2step_stateful": 102, "sequential_3step_stateful": 6, "conditional_routing_stateful": 240, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 63, "data_gap_recovery_extended_stateful": 82, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 175, "inconsistent_api_recovery_stateful": 218}, "scenarioWastedSum": {"relevance_detection": 16.0, "argument_fidelity": 5.0, "tool_selection": 9.0, "basic_2step": 0.0, "sequential_3step": 49.0, "conditional_routing": 85.0, "sequential_reasoning": 38.0, "error_recovery": 266.0, "data_gap_recovery": 40.0, 
"data_gap_recovery_extended": 13.0, "argument_transformation": 13.0, "grounded_synthesis": 302.0, "inconsistent_api_recovery": 307.0, "relevance_detection_stateful": 17.0, "argument_fidelity_stateful": 7.0, "tool_selection_stateful": 20.0, "basic_2step_stateful": 2.0, "sequential_3step_stateful": 47.0, "conditional_routing_stateful": 92.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 129.0, "data_gap_recovery_stateful": 42.0, "data_gap_recovery_extended_stateful": 22.0, "argument_transformation_stateful": 19.0, "grounded_synthesis_stateful": 294.0, "inconsistent_api_recovery_stateful": 334.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 12.42, "argument_fidelity": 33.08, "tool_selection": 33.26, "basic_2step": 18.04, "sequential_3step": 45.95, "conditional_routing": 82.89, "sequential_reasoning": 49.08, "error_recovery": 53.31, "data_gap_recovery": 99.74, "data_gap_recovery_extended": 125.28, "argument_transformation": 40.49, "grounded_synthesis": 287.87, "inconsistent_api_recovery": 246.95, "relevance_detection_stateful": 12.5, "argument_fidelity_stateful": 34.3, "tool_selection_stateful": 45.23, "basic_2step_stateful": 20.4, 
"sequential_3step_stateful": 12.04, "conditional_routing_stateful": 89.67, "sequential_reasoning_stateful": 53.14, "error_recovery_stateful": 42.68, "data_gap_recovery_stateful": 94.45, "data_gap_recovery_extended_stateful": 137.76, "argument_transformation_stateful": 25.31, "grounded_synthesis_stateful": 284.96, "inconsistent_api_recovery_stateful": 248.82}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 4, "grounded_synthesis": 45, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 7, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/N [bare]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 53.2, "accuracy": 64.9, "completeness": 82.1, "efficiency": 100.0, "wasted": 0.1, "speed": 15.0, "n": 50, "scenarios": {"relevance_detection": 92, "argument_fidelity": 100, "tool_selection": 4, "basic_2step": 86, "sequential_3step": 92, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 62, "data_gap_recovery_extended": 2, "argument_transformation": 32, "grounded_synthesis": 26, "inconsistent_api_recovery": 12, "relevance_detection_stateful": 86, "argument_fidelity_stateful": 100, "tool_selection_stateful": 6, "basic_2step_stateful": 
96, "sequential_3step_stateful": 90, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 43, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 31, "data_gap_recovery_extended": 1, "argument_transformation": 16, "grounded_synthesis": 13, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 48, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": 
{"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 43, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 43, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 46, "argument_fidelity": 150, "tool_selection": 6, "basic_2step": 86, "sequential_3step": 138, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 155, "data_gap_recovery_extended": 8, "argument_transformation": 80, "grounded_synthesis": 130, "inconsistent_api_recovery": 48, 
"relevance_detection_stateful": 43, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, "basic_2step_stateful": 96, "sequential_3step_stateful": 135, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 190, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 46, "argument_fidelity": 150, "tool_selection": 6, "basic_2step": 86, "sequential_3step": 138, "conditional_routing": 249, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 139, "data_gap_recovery_extended": 5, "argument_transformation": 63, "grounded_synthesis": 73, "inconsistent_api_recovery": 53, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, "basic_2step_stateful": 96, "sequential_3step_stateful": 135, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 166, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 67, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 49.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 5.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 12.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, 
"data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 2.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 43, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 50.7, "argument_fidelity": 262.67, "tool_selection": 11.48, "basic_2step": 124.3, "sequential_3step": 366.04, "conditional_routing": 1046.54, "sequential_reasoning": 539.87, "error_recovery": 0.0, "data_gap_recovery": 606.08, "data_gap_recovery_extended": 895.61, "argument_transformation": 2006.61, "grounded_synthesis": 1429.78, "inconsistent_api_recovery": 879.59, "relevance_detection_stateful": 46.43, "argument_fidelity_stateful": 271.98, "tool_selection_stateful": 18.89, "basic_2step_stateful": 129.24, "sequential_3step_stateful": 337.67, "conditional_routing_stateful": 1046.4, "sequential_reasoning_stateful": 490.25, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 607.89, "data_gap_recovery_extended_stateful": 876.85, "argument_transformation_stateful": 1755.44, "grounded_synthesis_stateful": 1490.61, "inconsistent_api_recovery_stateful": 696.83}, "scenarioSpeedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 2, 
"basic_2step": 43, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 49, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 48, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/P [bare:full]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 53.1, "accuracy": 66.0, "completeness": 80.5, "efficiency": 99.8, "wasted": 0.5, "speed": 2.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 78, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 100, "sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 94, "data_gap_recovery_extended": 76, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 82, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 34, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 31, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 38, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 17, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 117, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 200, "sequential_reasoning": 124, "error_recovery": 0, "data_gap_recovery": 235, "data_gap_recovery_extended": 304, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 123, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 256, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, 
"inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 159, "tool_selection": 117, "basic_2step": 100, "sequential_3step": 0, "conditional_routing": 249, "sequential_reasoning": 316, "error_recovery": 0, "data_gap_recovery": 187, "data_gap_recovery_extended": 133, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 159, "tool_selection_stateful": 123, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 166, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 184, "data_gap_recovery_extended_stateful": 114, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 9.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 49.0, "sequential_reasoning": 221.0, "error_recovery": 0.0, "data_gap_recovery": 8.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 19.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 9.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 164.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 16.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 11.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 18.45, "argument_fidelity": 48.01, "tool_selection": 47.61, "basic_2step": 30.5, "sequential_3step": 0.0, "conditional_routing": 131.75, "sequential_reasoning": 119.16, "error_recovery": 0.0, "data_gap_recovery": 157.53, "data_gap_recovery_extended": 198.56, "argument_transformation": 225.07, "grounded_synthesis": 189.47, "inconsistent_api_recovery": 174.53, "relevance_detection_stateful": 18.76, "argument_fidelity_stateful": 48.29, "tool_selection_stateful": 51.13, "basic_2step_stateful": 33.46, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 128.13, "sequential_reasoning_stateful": 90.94, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 156.64, "data_gap_recovery_extended_stateful": 192.9, "argument_transformation_stateful": 223.09, "grounded_synthesis_stateful": 194.53, "inconsistent_api_recovery_stateful": 183.68}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 39, "basic_2step": 50, "sequential_3step": 0, "conditional_routing": 50, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 0, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 30, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [bare:full]", "model": "Qwen3.6-35B-A3B-UD-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3.6-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 52.8, "accuracy": 88.4, "completeness": 59.8, "efficiency": 100.0, "wasted": 0.1, "speed": 11.8, "n": 50, "scenarios": {"relevance_detection": 14, "argument_fidelity": 72, "tool_selection": 2, "basic_2step": 92, "sequential_3step": 68, "conditional_routing": 92, "sequential_reasoning": 42, "error_recovery": 0, "data_gap_recovery": 98, "data_gap_recovery_extended": 58, "argument_transformation": 66, "grounded_synthesis": 52, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 80, "tool_selection_stateful": 4, "basic_2step_stateful": 86, "sequential_3step_stateful": 56, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 70, "argument_transformation_stateful": 56, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 21, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 29, "argument_transformation": 33, "grounded_synthesis": 26, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 35, "argument_transformation_stateful": 28, "grounded_synthesis_stateful": 23, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}, "scenarioValidated": 
{"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}, "scenarioIdealCalls": {"relevance_detection": 7, "argument_fidelity": 108, "tool_selection": 3, "basic_2step": 92, "sequential_3step": 102, "conditional_routing": 184, "sequential_reasoning": 84, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 232, "argument_transformation": 165, "grounded_synthesis": 260, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 120, "tool_selection_stateful": 6, "basic_2step_stateful": 86, "sequential_3step_stateful": 84, "conditional_routing_stateful": 180, "sequential_reasoning_stateful": 80, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 280, "argument_transformation_stateful": 140, "grounded_synthesis_stateful": 230, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 7, "argument_fidelity": 108, "tool_selection": 3, "basic_2step": 92, "sequential_3step": 102, "conditional_routing": 134, "sequential_reasoning": 84, "error_recovery": 0, "data_gap_recovery": 169, "data_gap_recovery_extended": 123, "argument_transformation": 132, "grounded_synthesis": 196, "inconsistent_api_recovery": 
240, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 120, "tool_selection_stateful": 6, "basic_2step_stateful": 86, "sequential_3step_stateful": 84, "conditional_routing_stateful": 121, "sequential_reasoning_stateful": 80, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 151, "data_gap_recovery_extended_stateful": 132, "argument_transformation_stateful": 117, "grounded_synthesis_stateful": 142, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 9.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 49.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 6.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 32.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, 
"data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}, "scenarioSpeedSum": {"relevance_detection": 47.07, "argument_fidelity": 126.97, "tool_selection": 2.92, "basic_2step": 139.07, "sequential_3step": 142.39, "conditional_routing": 443.97, "sequential_reasoning": 148.26, "error_recovery": 0.0, "data_gap_recovery": 490.41, "data_gap_recovery_extended": 480.18, "argument_transformation": 1264.45, "grounded_synthesis": 807.42, "inconsistent_api_recovery": 725.27, "relevance_detection_stateful": 32.83, "argument_fidelity_stateful": 127.91, "tool_selection_stateful": 5.59, "basic_2step_stateful": 122.27, "sequential_3step_stateful": 127.44, "conditional_routing_stateful": 446.13, "sequential_reasoning_stateful": 108.78, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 438.8, "data_gap_recovery_extended_stateful": 483.18, "argument_transformation_stateful": 1081.75, "grounded_synthesis_stateful": 723.53, "inconsistent_api_recovery_stateful": 681.41}, "scenarioSpeedN": {"relevance_detection": 7, "argument_fidelity": 36, "tool_selection": 1, "basic_2step": 46, "sequential_3step": 34, "conditional_routing": 46, "sequential_reasoning": 23, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 45, "argument_transformation": 39, "grounded_synthesis": 28, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 40, "tool_selection_stateful": 2, "basic_2step_stateful": 43, "sequential_3step_stateful": 28, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 27, "inconsistent_api_recovery_stateful": 42}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/N [reforged:full]", "model": 
"Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 51.8, "accuracy": 52.8, "completeness": 98.0, "efficiency": 75.1, "wasted": 1.6, "speed": 1.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 92, "basic_2step": 100, "sequential_3step": 80, "conditional_routing": 72, "sequential_reasoning": 66, "error_recovery": 76, "data_gap_recovery": 20, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 98, "sequential_3step_stateful": 4, "conditional_routing_stateful": 72, "sequential_reasoning_stateful": 54, "error_recovery_stateful": 80, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, 
"argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 40, "conditional_routing": 36, "sequential_reasoning": 33, "error_recovery": 38, "data_gap_recovery": 10, "data_gap_recovery_extended": 1, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 49, "sequential_3step_stateful": 2, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 40, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 138, "basic_2step": 100, "sequential_3step": 120, "conditional_routing": 144, "sequential_reasoning": 132, "error_recovery": 76, "data_gap_recovery": 50, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 98, "sequential_3step_stateful": 6, "conditional_routing_stateful": 144, "sequential_reasoning_stateful": 108, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 151, "tool_selection": 193, "basic_2step": 100, "sequential_3step": 298, "conditional_routing": 191, "sequential_reasoning": 174, "error_recovery": 114, "data_gap_recovery": 60, "data_gap_recovery_extended": 11, "argument_transformation": 0, "grounded_synthesis": 15, "inconsistent_api_recovery": 131, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 219, "basic_2step_stateful": 98, "sequential_3step_stateful": 17, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 135, "error_recovery_stateful": 120, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 17, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 32}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 71.0, "basic_2step": 0.0, "sequential_3step": 235.0, "conditional_routing": 56.0, "sequential_reasoning": 86.0, "error_recovery": 50.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 32.0, "argument_transformation": 33.0, "grounded_synthesis": 177.0, "inconsistent_api_recovery": 263.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 86.0, "basic_2step_stateful": 1.0, "sequential_3step_stateful": 171.0, "conditional_routing_stateful": 64.0, "sequential_reasoning_stateful": 63.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 14.0, "data_gap_recovery_extended_stateful": 46.0, "argument_transformation_stateful": 78.0, "grounded_synthesis_stateful": 193.0, "inconsistent_api_recovery_stateful": 296.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 10.73, "argument_fidelity": 43.08, "tool_selection": 48.31, "basic_2step": 18.13, "sequential_3step": 84.39, "conditional_routing": 95.7, 
"sequential_reasoning": 74.93, "error_recovery": 27.56, "data_gap_recovery": 67.6, "data_gap_recovery_extended": 88.58, "argument_transformation": 65.29, "grounded_synthesis": 189.11, "inconsistent_api_recovery": 223.29, "relevance_detection_stateful": 11.09, "argument_fidelity_stateful": 42.41, "tool_selection_stateful": 58.0, "basic_2step_stateful": 20.61, "sequential_3step_stateful": 62.64, "conditional_routing_stateful": 102.69, "sequential_reasoning_stateful": 103.33, "error_recovery_stateful": 27.64, "data_gap_recovery_stateful": 62.42, "data_gap_recovery_extended_stateful": 98.9, "argument_transformation_stateful": 83.69, "grounded_synthesis_stateful": 188.03, "inconsistent_api_recovery_stateful": 248.23}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 44, "grounded_synthesis": 50, "inconsistent_api_recovery": 42, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 47}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q4_K_M LF/P [reforged:full]", "model": "Meta-Llama-3.1-8B-Instruct.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 51.2, "accuracy": 57.2, "completeness": 89.5, "efficiency": 76.3, "wasted": 1.8, "speed": 3.5, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 100, 
"tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 10, "error_recovery": 24, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 32, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 98, "argument_fidelity_stateful": 98, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 26, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 40}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 5, "error_recovery": 12, "data_gap_recovery": 9, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 16, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 
49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 13, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 20}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, 
"inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 20, "error_recovery": 24, "data_gap_recovery": 45, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 160, "inconsistent_api_recovery": 224, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 39, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 160}, "scenarioActualCalls": {"relevance_detection": 52, "argument_fidelity": 150, "tool_selection": 168, "basic_2step": 100, "sequential_3step": 176, "conditional_routing": 208, "sequential_reasoning": 25, "error_recovery": 52, "data_gap_recovery": 63, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 274, "inconsistent_api_recovery": 406, "relevance_detection_stateful": 54, "argument_fidelity_stateful": 150, "tool_selection_stateful": 171, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 244, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 56, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 128, "inconsistent_api_recovery_stateful": 288}, "scenarioWastedSum": {"relevance_detection": 3.0, "argument_fidelity": 0.0, "tool_selection": 24.0, "basic_2step": 0.0, "sequential_3step": 26.0, "conditional_routing": 57.0, "sequential_reasoning": 41.0, "error_recovery": 118.0, "data_gap_recovery": 69.0, 
"data_gap_recovery_extended": 43.0, "argument_transformation": 102.0, "grounded_synthesis": 238.0, "inconsistent_api_recovery": 321.0, "relevance_detection_stateful": 5.0, "argument_fidelity_stateful": 3.0, "tool_selection_stateful": 21.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 90.0, "conditional_routing_stateful": 60.0, "sequential_reasoning_stateful": 67.0, "error_recovery_stateful": 73.0, "data_gap_recovery_stateful": 41.0, "data_gap_recovery_extended_stateful": 45.0, "argument_transformation_stateful": 42.0, "grounded_synthesis_stateful": 284.0, "inconsistent_api_recovery_stateful": 299.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 21.52, "argument_fidelity": 64.5, "tool_selection": 75.96, "basic_2step": 38.86, "sequential_3step": 82.83, "conditional_routing": 135.54, "sequential_reasoning": 94.03, "error_recovery": 75.85, "data_gap_recovery": 160.55, "data_gap_recovery_extended": 202.33, "argument_transformation": 191.79, "grounded_synthesis": 426.77, "inconsistent_api_recovery": 474.59, "relevance_detection_stateful": 22.48, "argument_fidelity_stateful": 68.23, "tool_selection_stateful": 81.61, "basic_2step_stateful": 43.49, 
"sequential_3step_stateful": 46.1, "conditional_routing_stateful": 139.46, "sequential_reasoning_stateful": 103.96, "error_recovery_stateful": 76.47, "data_gap_recovery_stateful": 128.97, "data_gap_recovery_extended_stateful": 202.33, "argument_transformation_stateful": 110.03, "grounded_synthesis_stateful": 499.22, "inconsistent_api_recovery_stateful": 474.31}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 15, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 49, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 13, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/N [bare]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 50.4, "accuracy": 63.4, "completeness": 79.5, "efficiency": 100.0, "wasted": 0.1, "speed": 23.1, "n": 50, "scenarios": {"relevance_detection": 88, "argument_fidelity": 100, "tool_selection": 2, "basic_2step": 62, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 98, "error_recovery": 0, "data_gap_recovery": 70, "data_gap_recovery_extended": 2, "argument_transformation": 22, "grounded_synthesis": 40, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 92, "argument_fidelity_stateful": 100, "tool_selection_stateful": 4, "basic_2step_stateful": 
46, "sequential_3step_stateful": 88, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 98, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 44, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 31, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 0, "data_gap_recovery": 35, "data_gap_recovery_extended": 1, "argument_transformation": 11, "grounded_synthesis": 20, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 23, "sequential_3step_stateful": 44, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": 
{"relevance_detection": 44, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 31, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 23, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 44, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 31, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 23, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 44, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 62, "sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 175, "data_gap_recovery_extended": 8, "argument_transformation": 55, "grounded_synthesis": 200, "inconsistent_api_recovery": 8, 
"relevance_detection_stateful": 46, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 46, "sequential_3step_stateful": 132, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 175, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 44, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 62, "sequential_3step": 147, "conditional_routing": 246, "sequential_reasoning": 196, "error_recovery": 0, "data_gap_recovery": 157, "data_gap_recovery_extended": 8, "argument_transformation": 42, "grounded_synthesis": 105, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 150, "tool_selection_stateful": 6, "basic_2step_stateful": 46, "sequential_3step_stateful": 132, "conditional_routing_stateful": 235, "sequential_reasoning_stateful": 196, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 165, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 47.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 3.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 47.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 11.0, 
"data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 44, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 31, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 23, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 165.11, "argument_fidelity": 455.31, "tool_selection": 8.93, "basic_2step": 150.75, "sequential_3step": 702.52, "conditional_routing": 1382.06, "sequential_reasoning": 648.61, "error_recovery": 0.0, "data_gap_recovery": 950.99, "data_gap_recovery_extended": 1305.67, "argument_transformation": 2843.26, "grounded_synthesis": 2223.63, "inconsistent_api_recovery": 1051.91, "relevance_detection_stateful": 168.85, "argument_fidelity_stateful": 434.01, "tool_selection_stateful": 14.02, "basic_2step_stateful": 104.99, "sequential_3step_stateful": 672.81, "conditional_routing_stateful": 1415.17, "sequential_reasoning_stateful": 738.25, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1011.04, "data_gap_recovery_extended_stateful": 1188.52, "argument_transformation_stateful": 2856.7, "grounded_synthesis_stateful": 2118.86, "inconsistent_api_recovery_stateful": 1264.19}, "scenarioSpeedN": {"relevance_detection": 44, "argument_fidelity": 50, 
"tool_selection": 1, "basic_2step": 31, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 2, "basic_2step_stateful": 23, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/N [bare:keep-last]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 49.7, "accuracy": 65.4, "completeness": 76.0, "efficiency": 100.0, "wasted": 0.1, "speed": 21.6, "n": 50, "scenarios": {"relevance_detection": 86, "argument_fidelity": 78, "tool_selection": 2, "basic_2step": 88, "sequential_3step": 88, "conditional_routing": 100, "sequential_reasoning": 80, "error_recovery": 0, "data_gap_recovery": 64, "data_gap_recovery_extended": 2, "argument_transformation": 18, "grounded_synthesis": 18, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 90, "argument_fidelity_stateful": 88, "tool_selection_stateful": 2, "basic_2step_stateful": 84, "sequential_3step_stateful": 94, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 86, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 92, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 43, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 44, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 1, "argument_transformation": 9, "grounded_synthesis": 9, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 44, "tool_selection_stateful": 1, "basic_2step_stateful": 42, "sequential_3step_stateful": 47, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 43, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 47, "argument_transformation": 41, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 45, 
"tool_selection_stateful": 1, "basic_2step_stateful": 42, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 43, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 47, "argument_transformation": 41, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 45, "tool_selection_stateful": 1, "basic_2step_stateful": 42, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 43, "argument_fidelity": 117, "tool_selection": 3, "basic_2step": 88, "sequential_3step": 132, "conditional_routing": 200, "sequential_reasoning": 160, "error_recovery": 0, "data_gap_recovery": 160, "data_gap_recovery_extended": 8, "argument_transformation": 45, "grounded_synthesis": 90, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 132, "tool_selection_stateful": 3, "basic_2step_stateful": 84, "sequential_3step_stateful": 141, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 230, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 80, 
"inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 43, "argument_fidelity": 117, "tool_selection": 3, "basic_2step": 88, "sequential_3step": 132, "conditional_routing": 247, "sequential_reasoning": 160, "error_recovery": 0, "data_gap_recovery": 133, "data_gap_recovery_extended": 5, "argument_transformation": 35, "grounded_synthesis": 47, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 132, "tool_selection_stateful": 3, "basic_2step_stateful": 84, "sequential_3step_stateful": 141, "conditional_routing_stateful": 237, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 206, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 27, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 47.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 1.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 45.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 13.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 43, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 47, 
"argument_transformation": 41, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 45, "tool_selection_stateful": 1, "basic_2step_stateful": 42, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 183.46, "argument_fidelity": 377.41, "tool_selection": 6.91, "basic_2step": 210.97, "sequential_3step": 727.25, "conditional_routing": 1279.15, "sequential_reasoning": 682.73, "error_recovery": 0.0, "data_gap_recovery": 821.26, "data_gap_recovery_extended": 1224.88, "argument_transformation": 2118.91, "grounded_synthesis": 1675.69, "inconsistent_api_recovery": 1043.8, "relevance_detection_stateful": 144.06, "argument_fidelity_stateful": 405.96, "tool_selection_stateful": 6.58, "basic_2step_stateful": 232.07, "sequential_3step_stateful": 788.92, "conditional_routing_stateful": 1292.05, "sequential_reasoning_stateful": 674.22, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1055.36, "data_gap_recovery_extended_stateful": 1353.97, "argument_transformation_stateful": 2312.39, "grounded_synthesis_stateful": 1527.24, "inconsistent_api_recovery_stateful": 1207.53}, "scenarioSpeedN": {"relevance_detection": 43, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 44, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 47, "argument_transformation": 41, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 45, "tool_selection_stateful": 1, "basic_2step_stateful": 42, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 49, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/P [bare]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 50.2, "accuracy": 83.0, "completeness": 60.5, "efficiency": 100.0, "wasted": 0.2, "speed": 2.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 98, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 82, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 78, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 94, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 90, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 54}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 39, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 27}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 205, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 0, "inconsistent_api_recovery": 312, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 141, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 225, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 216}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 147, "conditional_routing": 250, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 123, "data_gap_recovery_extended": 0, "argument_transformation": 16, "grounded_synthesis": 0, "inconsistent_api_recovery": 317, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 141, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 135, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 164}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 27.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 23.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 17.04, "argument_fidelity": 65.84, "tool_selection": 0.0, "basic_2step": 29.56, "sequential_3step": 87.98, "conditional_routing": 154.27, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 157.81, "data_gap_recovery_extended": 0.0, "argument_transformation": 300.27, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 239.0, "relevance_detection_stateful": 17.05, "argument_fidelity_stateful": 65.15, "tool_selection_stateful": 0.0, "basic_2step_stateful": 33.04, "sequential_3step_stateful": 89.65, "conditional_routing_stateful": 151.29, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 174.92, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 280.66, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 241.52}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/N [reforged:full]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", 
"mode": "native", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 48.8, "accuracy": 51.8, "completeness": 94.2, "efficiency": 71.2, "wasted": 1.8, "speed": 2.5, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 86, "sequential_3step": 92, "conditional_routing": 64, "sequential_reasoning": 68, "error_recovery": 38, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 92, "sequential_3step_stateful": 16, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 36, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 14, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 38}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 43, 
"sequential_3step": 46, "conditional_routing": 32, "sequential_reasoning": 34, "error_recovery": 19, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 15, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 46, "sequential_3step_stateful": 8, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 21, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 19}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 86, "sequential_3step": 138, "conditional_routing": 128, "sequential_reasoning": 136, "error_recovery": 38, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 120, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 92, "sequential_3step_stateful": 24, "conditional_routing_stateful": 104, "sequential_reasoning_stateful": 72, "error_recovery_stateful": 63, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 152}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 151, "tool_selection": 185, "basic_2step": 90, "sequential_3step": 326, "conditional_routing": 170, "sequential_reasoning": 216, "error_recovery": 58, "data_gap_recovery": 8, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 245, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 152, "tool_selection_stateful": 159, "basic_2step_stateful": 95, "sequential_3step_stateful": 47, "conditional_routing_stateful": 142, "sequential_reasoning_stateful": 106, "error_recovery_stateful": 70, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, 
"inconsistent_api_recovery_stateful": 305}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 35.0, "basic_2step": 11.0, "sequential_3step": 197.0, "conditional_routing": 55.0, "sequential_reasoning": 89.0, "error_recovery": 54.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 17.0, "argument_transformation": 188.0, "grounded_synthesis": 155.0, "inconsistent_api_recovery": 288.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 30.0, "basic_2step_stateful": 7.0, "sequential_3step_stateful": 239.0, "conditional_routing_stateful": 56.0, "sequential_reasoning_stateful": 52.0, "error_recovery_stateful": 10.0, "data_gap_recovery_stateful": 22.0, "data_gap_recovery_extended_stateful": 20.0, "argument_transformation_stateful": 156.0, "grounded_synthesis_stateful": 192.0, "inconsistent_api_recovery_stateful": 302.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}, "scenarioSpeedSum": {"relevance_detection": 15.33, "argument_fidelity": 64.2, "tool_selection": 67.45, "basic_2step": 30.5, "sequential_3step": 121.19, "conditional_routing": 128.53, "sequential_reasoning": 122.07, "error_recovery": 42.98, 
"data_gap_recovery": 101.41, "data_gap_recovery_extended": 118.43, "argument_transformation": 139.21, "grounded_synthesis": 261.56, "inconsistent_api_recovery": 318.68, "relevance_detection_stateful": 16.02, "argument_fidelity_stateful": 62.72, "tool_selection_stateful": 69.15, "basic_2step_stateful": 31.94, "sequential_3step_stateful": 110.39, "conditional_routing_stateful": 119.38, "sequential_reasoning_stateful": 97.4, "error_recovery_stateful": 43.83, "data_gap_recovery_stateful": 111.6, "data_gap_recovery_extended_stateful": 122.69, "argument_transformation_stateful": 123.85, "grounded_synthesis_stateful": 272.46, "inconsistent_api_recovery_stateful": 326.8}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 49, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 31, "grounded_synthesis": 50, "inconsistent_api_recovery": 40, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 39, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 40}}, {"label": "Qwen3-14B-Q4_K_M LS/N [bare]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 48.1, "accuracy": 60.0, "completeness": 80.2, "efficiency": 100.0, "wasted": 0.1, "speed": 21.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 96, "tool_selection": 0, "basic_2step": 38, "sequential_3step": 66, "conditional_routing": 100, 
"sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 62, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 48, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 0, "basic_2step_stateful": 42, "sequential_3step_stateful": 56, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 72, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 56, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 48, "tool_selection": 0, "basic_2step": 19, "sequential_3step": 33, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 31, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 24, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 0, "basic_2step_stateful": 21, "sequential_3step_stateful": 28, 
"conditional_routing_stateful": 47, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 36, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 48, "tool_selection": 4, "basic_2step": 19, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 2, "basic_2step_stateful": 21, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 48, "tool_selection": 4, "basic_2step": 19, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 2, "basic_2step_stateful": 21, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, 
"argument_fidelity": 144, "tool_selection": 0, "basic_2step": 38, "sequential_3step": 99, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 155, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 240, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 0, "basic_2step_stateful": 42, "sequential_3step_stateful": 84, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 180, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 280, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 144, "tool_selection": 0, "basic_2step": 32, "sequential_3step": 99, "conditional_routing": 174, "sequential_reasoning": 198, "error_recovery": 0, "data_gap_recovery": 121, "data_gap_recovery_extended": 16, "argument_transformation": 0, "grounded_synthesis": 133, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 0, "basic_2step_stateful": 37, "sequential_3step_stateful": 84, "conditional_routing_stateful": 175, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 143, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 155, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 16.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 17.0, "inconsistent_api_recovery": 1.0, 
"relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 21.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 10.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 48, "tool_selection": 4, "basic_2step": 19, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 2, "basic_2step_stateful": 21, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 90.94, "argument_fidelity": 319.53, "tool_selection": 10.32, "basic_2step": 61.37, "sequential_3step": 425.73, "conditional_routing": 929.99, "sequential_reasoning": 793.25, "error_recovery": 0.0, "data_gap_recovery": 709.46, "data_gap_recovery_extended": 1141.85, "argument_transformation": 3406.45, "grounded_synthesis": 2403.55, "inconsistent_api_recovery": 745.77, "relevance_detection_stateful": 90.61, "argument_fidelity_stateful": 327.2, "tool_selection_stateful": 5.13, "basic_2step_stateful": 70.29, "sequential_3step_stateful": 384.19, "conditional_routing_stateful": 1004.66, "sequential_reasoning_stateful": 858.31, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 734.16, "data_gap_recovery_extended_stateful": 1111.16, "argument_transformation_stateful": 3530.1, "grounded_synthesis_stateful": 2226.02, "inconsistent_api_recovery_stateful": 755.46}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 48, "tool_selection": 4, "basic_2step": 19, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 2, "basic_2step_stateful": 21, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:8b-q8_0 OL/N [bare:full]", "model": "qwen3:8b-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 47.3, "accuracy": 56.8, "completeness": 83.3, "efficiency": 95.8, "wasted": 0.1, "speed": 24.0, "n": 50, "scenarios": {"relevance_detection": 86, "argument_fidelity": 100, "tool_selection": 2, "basic_2step": 34, "sequential_3step": 92, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 64, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 6, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 98, "tool_selection_stateful": 2, "basic_2step_stateful": 68, "sequential_3step_stateful": 92, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 84, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 46, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 32, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 3, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 34, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 43, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 34, "sequential_3step": 138, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 160, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 30, "inconsistent_api_recovery": 64, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 147, "tool_selection_stateful": 3, "basic_2step_stateful": 68, 
"sequential_3step_stateful": 138, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 210, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 43, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 34, "sequential_3step": 138, "conditional_routing": 243, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 153, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 23, "inconsistent_api_recovery": 76, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 147, "tool_selection_stateful": 3, "basic_2step_stateful": 68, "sequential_3step_stateful": 138, "conditional_routing_stateful": 245, "sequential_reasoning_stateful": 184, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 205, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 22}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 45.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 14.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 49.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, 
"inconsistent_api_recovery_stateful": 19.0}, "scenarioWastedN": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 134.13, "argument_fidelity": 379.45, "tool_selection": 8.78, "basic_2step": 73.93, "sequential_3step": 636.21, "conditional_routing": 1430.82, "sequential_reasoning": 512.68, "error_recovery": 175.78, "data_gap_recovery": 769.24, "data_gap_recovery_extended": 1745.51, "argument_transformation": 2307.53, "grounded_synthesis": 2670.9, "inconsistent_api_recovery": 1783.53, "relevance_detection_stateful": 135.46, "argument_fidelity_stateful": 407.11, "tool_selection_stateful": 9.18, "basic_2step_stateful": 195.02, "sequential_3step_stateful": 723.4, "conditional_routing_stateful": 1571.28, "sequential_reasoning_stateful": 552.39, "error_recovery_stateful": 144.63, "data_gap_recovery_stateful": 903.44, "data_gap_recovery_extended_stateful": 1803.24, "argument_transformation_stateful": 2614.36, "grounded_synthesis_stateful": 2433.41, "inconsistent_api_recovery_stateful": 1860.26}, "scenarioSpeedN": {"relevance_detection": 43, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 38, "data_gap_recovery": 39, "data_gap_recovery_extended": 49, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 49, "tool_selection_stateful": 1, "basic_2step_stateful": 35, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 47, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q8_0 LS/N [bare:full]", "model": "Qwen3-8B-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 46.8, "accuracy": 63.0, "completeness": 74.4, "efficiency": 100.0, "wasted": 0.1, "speed": 20.9, "n": 50, "scenarios": {"relevance_detection": 84, "argument_fidelity": 70, "tool_selection": 0, "basic_2step": 84, "sequential_3step": 96, "conditional_routing": 96, "sequential_reasoning": 82, "error_recovery": 0, "data_gap_recovery": 62, "data_gap_recovery_extended": 0, "argument_transformation": 12, "grounded_synthesis": 14, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 70, "tool_selection_stateful": 2, "basic_2step_stateful": 98, "sequential_3step_stateful": 88, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 86, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 74, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 42, "argument_fidelity": 35, "tool_selection": 0, "basic_2step": 42, "sequential_3step": 48, "conditional_routing": 48, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 31, "data_gap_recovery_extended": 0, "argument_transformation": 6, "grounded_synthesis": 7, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 1, "basic_2step_stateful": 49, "sequential_3step_stateful": 44, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 42, "argument_fidelity": 35, "tool_selection": 0, "basic_2step": 42, "sequential_3step": 49, "conditional_routing": 49, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 47, "argument_transformation": 30, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 1, "basic_2step_stateful": 49, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, 
"sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 42, "argument_fidelity": 35, "tool_selection": 0, "basic_2step": 42, "sequential_3step": 49, "conditional_routing": 49, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 47, "argument_transformation": 30, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 1, "basic_2step_stateful": 49, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 42, "argument_fidelity": 105, "tool_selection": 0, "basic_2step": 84, "sequential_3step": 144, "conditional_routing": 192, "sequential_reasoning": 164, "error_recovery": 0, "data_gap_recovery": 155, "data_gap_recovery_extended": 0, "argument_transformation": 30, "grounded_synthesis": 70, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 105, "tool_selection_stateful": 3, "basic_2step_stateful": 98, "sequential_3step_stateful": 132, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 185, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 42, "argument_fidelity": 105, 
"tool_selection": 0, "basic_2step": 84, "sequential_3step": 144, "conditional_routing": 236, "sequential_reasoning": 164, "error_recovery": 0, "data_gap_recovery": 149, "data_gap_recovery_extended": 0, "argument_transformation": 37, "grounded_synthesis": 42, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 105, "tool_selection_stateful": 3, "basic_2step_stateful": 98, "sequential_3step_stateful": 132, "conditional_routing_stateful": 207, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 162, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 13, "grounded_synthesis_stateful": 51, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 9.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 14.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 40.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 9.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 7.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 2.0}, "scenarioWastedN": {"relevance_detection": 42, "argument_fidelity": 35, "tool_selection": 0, "basic_2step": 42, "sequential_3step": 49, "conditional_routing": 49, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 47, "argument_transformation": 30, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 1, "basic_2step_stateful": 49, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 128.73, "argument_fidelity": 323.51, "tool_selection": 0.0, "basic_2step": 204.91, "sequential_3step": 729.78, "conditional_routing": 1249.78, "sequential_reasoning": 652.92, "error_recovery": 0.0, "data_gap_recovery": 1050.4, "data_gap_recovery_extended": 1219.21, "argument_transformation": 1361.17, "grounded_synthesis": 1730.5, "inconsistent_api_recovery": 1069.87, "relevance_detection_stateful": 150.91, "argument_fidelity_stateful": 329.25, "tool_selection_stateful": 5.45, "basic_2step_stateful": 268.72, "sequential_3step_stateful": 685.49, "conditional_routing_stateful": 1210.25, "sequential_reasoning_stateful": 709.11, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1069.56, "data_gap_recovery_extended_stateful": 1256.46, "argument_transformation_stateful": 1971.86, "grounded_synthesis_stateful": 1780.16, "inconsistent_api_recovery_stateful": 1052.22}, "scenarioSpeedN": {"relevance_detection": 42, "argument_fidelity": 35, "tool_selection": 0, "basic_2step": 42, "sequential_3step": 49, "conditional_routing": 49, "sequential_reasoning": 41, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 47, "argument_transformation": 30, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 35, "tool_selection_stateful": 1, "basic_2step_stateful": 49, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 43, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}}, {"label": "claude-haiku-4-5-20251001 AN/N [bare]", "model": "claude-haiku-4-5-20251001", "backend": "anthropic", "mode": "native", "ablation": "bare", "replay": "none", "family": "claude", "quant": "n/a", "gen": 3, "retired": false, "score": 47.2, "accuracy": 87.1, "completeness": 54.2, "efficiency": 100.0, "wasted": 0.0, "speed": 7.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 2, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 2, "argument_transformation": 82, "grounded_synthesis": 100, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 98, "tool_selection_stateful": 100, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 1, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 1, "argument_transformation": 41, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 1, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 1, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 2, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 8, "argument_transformation": 205, "grounded_synthesis": 500, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 105, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 2, "sequential_3step": 150, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 4, "argument_transformation": 131, "grounded_synthesis": 168, "inconsistent_api_recovery": 250, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 147, "tool_selection_stateful": 150, "basic_2step_stateful": 
0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 73, "grounded_synthesis_stateful": 173, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 1, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 171.0, "tool_selection": 171.74, "basic_2step": 1.98, "sequential_3step": 202.43, "conditional_routing": 338.12, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 15.06, "argument_transformation": 496.51, "grounded_synthesis": 782.81, "inconsistent_api_recovery": 432.92, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 181.64, "tool_selection_stateful": 170.88, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 207.24, "conditional_routing_stateful": 332.3, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 33.73, "argument_transformation_stateful": 458.54, "grounded_synthesis_stateful": 768.74, "inconsistent_api_recovery_stateful": 421.85}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 1, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 2, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.6-27B-Q4_K_M LS/N [bare:full]", "model": "Qwen3.6-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3.6-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 47.0, "accuracy": 92.4, "completeness": 50.8, 
"efficiency": 100.0, "wasted": 0.0, "speed": 26.3, "n": 50, "scenarios": {"relevance_detection": 52, "argument_fidelity": 88, "tool_selection": 82, "basic_2step": 68, "sequential_3step": 28, "conditional_routing": 42, "sequential_reasoning": 48, "error_recovery": 0, "data_gap_recovery": 72, "data_gap_recovery_extended": 26, "argument_transformation": 22, "grounded_synthesis": 62, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 62, "argument_fidelity_stateful": 88, "tool_selection_stateful": 80, "basic_2step_stateful": 66, "sequential_3step_stateful": 40, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 54, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 80, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 62, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 13, "argument_transformation": 11, 
"grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}, "scenarioValidated": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 
40, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}, "scenarioIdealCalls": {"relevance_detection": 26, "argument_fidelity": 132, "tool_selection": 123, "basic_2step": 68, "sequential_3step": 42, "conditional_routing": 84, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 180, "data_gap_recovery_extended": 104, "argument_transformation": 55, "grounded_synthesis": 310, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 132, "tool_selection_stateful": 120, "basic_2step_stateful": 66, "sequential_3step_stateful": 60, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 108, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 96, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 310, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 26, "argument_fidelity": 132, "tool_selection": 123, "basic_2step": 68, "sequential_3step": 42, "conditional_routing": 48, "sequential_reasoning": 96, "error_recovery": 0, "data_gap_recovery": 115, "data_gap_recovery_extended": 47, "argument_transformation": 38, "grounded_synthesis": 104, "inconsistent_api_recovery": 57, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 132, "tool_selection_stateful": 120, "basic_2step_stateful": 66, "sequential_3step_stateful": 60, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 108, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 134, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 21, "grounded_synthesis_stateful": 105, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 
1.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}, "scenarioSpeedSum": {"relevance_detection": 835.3, "argument_fidelity": 476.07, "tool_selection": 304.38, "basic_2step": 217.69, "sequential_3step": 224.89, "conditional_routing": 576.75, "sequential_reasoning": 463.44, "error_recovery": 0.0, "data_gap_recovery": 1217.85, "data_gap_recovery_extended": 934.07, "argument_transformation": 993.66, "grounded_synthesis": 1928.32, "inconsistent_api_recovery": 571.12, "relevance_detection_stateful": 959.43, "argument_fidelity_stateful": 470.08, 
"tool_selection_stateful": 280.4, "basic_2step_stateful": 216.82, "sequential_3step_stateful": 261.62, "conditional_routing_stateful": 566.35, "sequential_reasoning_stateful": 492.5, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1396.69, "data_gap_recovery_extended_stateful": 880.34, "argument_transformation_stateful": 648.32, "grounded_synthesis_stateful": 1870.55, "inconsistent_api_recovery_stateful": 568.7}, "scenarioSpeedN": {"relevance_detection": 26, "argument_fidelity": 44, "tool_selection": 41, "basic_2step": 34, "sequential_3step": 14, "conditional_routing": 21, "sequential_reasoning": 24, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 33, "argument_transformation": 12, "grounded_synthesis": 31, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 31, "argument_fidelity_stateful": 44, "tool_selection_stateful": 40, "basic_2step_stateful": 33, "sequential_3step_stateful": 20, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 27, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 30, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 9}}, {"label": "Mistral-7B-Instruct-v0.3.Q8_0 LF/P [reforged:full]", "model": "Mistral-7B-Instruct-v0.3.Q8_0", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 45.7, "accuracy": 47.0, "completeness": 97.2, "efficiency": 89.6, "wasted": 0.7, "speed": 8.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 42, "sequential_reasoning": 64, "error_recovery": 4, "data_gap_recovery": 30, "data_gap_recovery_extended": 6, "argument_transformation": 0, "grounded_synthesis": 12, "inconsistent_api_recovery": 20, 
"relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 96, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 12, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 12}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 21, "sequential_reasoning": 32, "error_recovery": 2, "data_gap_recovery": 15, "data_gap_recovery_extended": 3, "argument_transformation": 0, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 2, "data_gap_recovery_stateful": 13, "data_gap_recovery_extended_stateful": 1, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 6}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 84, "sequential_reasoning": 128, "error_recovery": 4, 
"data_gap_recovery": 75, "data_gap_recovery_extended": 24, "argument_transformation": 0, "grounded_synthesis": 60, "inconsistent_api_recovery": 80, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 144, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 24, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 65, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 48}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 200, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 224, "conditional_routing": 54, "sequential_reasoning": 82, "error_recovery": 8, "data_gap_recovery": 83, "data_gap_recovery_extended": 25, "argument_transformation": 0, "grounded_synthesis": 35, "inconsistent_api_recovery": 117, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 196, "tool_selection_stateful": 0, "basic_2step_stateful": 150, "sequential_3step_stateful": 197, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 29, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 73, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 72}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 50.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 76.0, "conditional_routing": 36.0, "sequential_reasoning": 10.0, "error_recovery": 103.0, "data_gap_recovery": 31.0, "data_gap_recovery_extended": 29.0, "argument_transformation": 16.0, "grounded_synthesis": 22.0, "inconsistent_api_recovery": 72.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 49.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 54.0, 
"conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 12.0, "error_recovery_stateful": 54.0, "data_gap_recovery_stateful": 29.0, "data_gap_recovery_extended_stateful": 12.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 48.0, "inconsistent_api_recovery_stateful": 67.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 42.78, "argument_fidelity": 172.23, "tool_selection": 75.95, "basic_2step": 90.5, "sequential_3step": 179.35, "conditional_routing": 339.16, "sequential_reasoning": 180.03, "error_recovery": 108.96, "data_gap_recovery": 433.06, "data_gap_recovery_extended": 707.71, "argument_transformation": 948.56, "grounded_synthesis": 1134.66, "inconsistent_api_recovery": 849.08, "relevance_detection_stateful": 39.5, "argument_fidelity_stateful": 167.96, "tool_selection_stateful": 76.46, "basic_2step_stateful": 104.79, "sequential_3step_stateful": 160.15, "conditional_routing_stateful": 339.01, "sequential_reasoning_stateful": 158.07, "error_recovery_stateful": 112.23, "data_gap_recovery_stateful": 413.18, "data_gap_recovery_extended_stateful": 618.05, "argument_transformation_stateful": 1017.89, 
"grounded_synthesis_stateful": 1135.63, "inconsistent_api_recovery_stateful": 805.87}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 43, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/P [reforged:full]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 46.1, "accuracy": 47.8, "completeness": 96.5, "efficiency": 89.9, "wasted": 0.7, "speed": 4.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 98, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 42, "sequential_reasoning": 76, "error_recovery": 14, "data_gap_recovery": 28, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 16, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 86, "conditional_routing_stateful": 38, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 2, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 10}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 21, "sequential_reasoning": 38, "error_recovery": 7, "data_gap_recovery": 14, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 19, "sequential_reasoning_stateful": 9, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 5}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, 
"data_gap_recovery_extended": 46, "argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 46, "argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 84, "sequential_reasoning": 152, "error_recovery": 14, "data_gap_recovery": 70, "data_gap_recovery_extended": 16, "argument_transformation": 0, "grounded_synthesis": 80, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 129, "conditional_routing_stateful": 76, 
"sequential_reasoning_stateful": 36, "error_recovery_stateful": 18, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 40}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 196, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 218, "conditional_routing": 54, "sequential_reasoning": 104, "error_recovery": 28, "data_gap_recovery": 70, "data_gap_recovery_extended": 7, "argument_transformation": 0, "grounded_synthesis": 65, "inconsistent_api_recovery": 122, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 200, "tool_selection_stateful": 0, "basic_2step_stateful": 149, "sequential_3step_stateful": 182, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 42, "error_recovery_stateful": 24, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 64}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 50.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 72.0, "conditional_routing": 35.0, "sequential_reasoning": 12.0, "error_recovery": 101.0, "data_gap_recovery": 32.0, "data_gap_recovery_extended": 24.0, "argument_transformation": 19.0, "grounded_synthesis": 31.0, "inconsistent_api_recovery": 90.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 49.0, "sequential_3step_stateful": 53.0, "conditional_routing_stateful": 42.0, "sequential_reasoning_stateful": 8.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 37.0, "data_gap_recovery_extended_stateful": 18.0, "argument_transformation_stateful": 6.0, "grounded_synthesis_stateful": 21.0, "inconsistent_api_recovery_stateful": 78.0}, "scenarioWastedN": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 46, "argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 22.96, "argument_fidelity": 97.21, "tool_selection": 41.66, "basic_2step": 49.08, "sequential_3step": 100.63, "conditional_routing": 190.9, "sequential_reasoning": 109.43, "error_recovery": 59.44, "data_gap_recovery": 275.79, "data_gap_recovery_extended": 377.75, "argument_transformation": 519.84, "grounded_synthesis": 653.94, "inconsistent_api_recovery": 492.76, "relevance_detection_stateful": 23.11, "argument_fidelity_stateful": 97.91, "tool_selection_stateful": 40.37, "basic_2step_stateful": 57.73, "sequential_3step_stateful": 80.09, "conditional_routing_stateful": 198.39, "sequential_reasoning_stateful": 90.01, "error_recovery_stateful": 58.18, "data_gap_recovery_stateful": 229.13, "data_gap_recovery_extended_stateful": 332.53, "argument_transformation_stateful": 537.36, "grounded_synthesis_stateful": 684.16, "inconsistent_api_recovery_stateful": 495.15}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 46, 
"argument_transformation": 49, "grounded_synthesis": 48, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 37, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}}, {"label": "granite-4.1-8b-Q4_K_M LS/P [bare]", "model": "granite-4.1-8b-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 46.2, "accuracy": 50.0, "completeness": 92.3, "efficiency": 100.0, "wasted": 0.0, "speed": 2.0, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 100, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 100, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 250, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 250, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, 
"conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 150, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 150, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 55.58, "argument_fidelity": 40.59, "tool_selection": 32.65, "basic_2step": 20.57, "sequential_3step": 41.03, "conditional_routing": 108.67, "sequential_reasoning": 54.08, "error_recovery": 0.0, "data_gap_recovery": 37.56, "data_gap_recovery_extended": 108.11, "argument_transformation": 479.92, "grounded_synthesis": 149.18, "inconsistent_api_recovery": 74.58, "relevance_detection_stateful": 58.01, "argument_fidelity_stateful": 41.05, "tool_selection_stateful": 36.65, "basic_2step_stateful": 22.44, "sequential_3step_stateful": 41.03, "conditional_routing_stateful": 103.46, "sequential_reasoning_stateful": 53.69, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 38.52, "data_gap_recovery_extended_stateful": 108.04, "argument_transformation_stateful": 479.76, "grounded_synthesis_stateful": 165.01, "inconsistent_api_recovery_stateful": 79.12}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite4.1:8b-q8_0 OL/N [bare:full]", "model": "granite4.1:8b-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 46.2, "accuracy": 60.0, "completeness": 76.9, "efficiency": 95.5, "wasted": 0.7, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, 
"sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 108.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 108.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, 
"argument_fidelity": 73.35, "tool_selection": 0.0, "basic_2step": 44.85, "sequential_3step": 68.11, "conditional_routing": 160.59, "sequential_reasoning": 80.53, "error_recovery": 0.0, "data_gap_recovery": 219.63, "data_gap_recovery_extended": 206.7, "argument_transformation": 181.55, "grounded_synthesis": 312.01, "inconsistent_api_recovery": 213.27, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 73.33, "tool_selection_stateful": 0.0, "basic_2step_stateful": 43.04, "sequential_3step_stateful": 68.31, "conditional_routing_stateful": 168.47, "sequential_reasoning_stateful": 80.55, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 219.63, "data_gap_recovery_extended_stateful": 206.71, "argument_transformation_stateful": 181.6, "grounded_synthesis_stateful": 312.06, "inconsistent_api_recovery_stateful": 213.25}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.1-8b-Q8_0 LS/N [bare]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 46.2, "accuracy": 60.0, "completeness": 77.0, "efficiency": 95.5, "wasted": 1.1, 
"speed": 3.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 2, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 3, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, 
"sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 397.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 250.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 63.11, "tool_selection": 1.16, "basic_2step": 34.57, "sequential_3step": 52.59, "conditional_routing": 184.5, "sequential_reasoning": 80.62, "error_recovery": 0.0, "data_gap_recovery": 199.62, "data_gap_recovery_extended": 249.26, "argument_transformation": 249.67, "grounded_synthesis": 293.54, "inconsistent_api_recovery": 196.13, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 64.25, 
"tool_selection_stateful": 0.0, "basic_2step_stateful": 38.05, "sequential_3step_stateful": 52.54, "conditional_routing_stateful": 191.15, "sequential_reasoning_stateful": 80.32, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 204.11, "data_gap_recovery_extended_stateful": 244.79, "argument_transformation_stateful": 217.98, "grounded_synthesis_stateful": 306.99, "inconsistent_api_recovery_stateful": 200.56}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 1, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "qwen3:14b-q4_K_M OL/N [bare:full]", "model": "qwen3:14b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 46.5, "accuracy": 61.3, "completeness": 75.8, "efficiency": 87.4, "wasted": 0.7, "speed": 34.7, "n": 50, "scenarios": {"relevance_detection": 90, "argument_fidelity": 92, "tool_selection": 4, "basic_2step": 4, "sequential_3step": 86, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 78, "data_gap_recovery_extended": 6, "argument_transformation": 4, "grounded_synthesis": 48, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 86, "argument_fidelity_stateful": 
92, "tool_selection_stateful": 2, "basic_2step_stateful": 26, "sequential_3step_stateful": 84, "conditional_routing_stateful": 68, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 86, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 43, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 39, "data_gap_recovery_extended": 3, "argument_transformation": 2, "grounded_synthesis": 24, "inconsistent_api_recovery": 3, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 42, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 21, 
"inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 45, "argument_fidelity": 138, "tool_selection": 6, "basic_2step": 4, "sequential_3step": 129, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 195, "data_gap_recovery_extended": 24, "argument_transformation": 10, 
"grounded_synthesis": 240, "inconsistent_api_recovery": 24, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 138, "tool_selection_stateful": 3, "basic_2step_stateful": 26, "sequential_3step_stateful": 126, "conditional_routing_stateful": 136, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 215, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 210, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 45, "argument_fidelity": 138, "tool_selection": 6, "basic_2step": 4, "sequential_3step": 129, "conditional_routing": 227, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 189, "data_gap_recovery_extended": 22, "argument_transformation": 8, "grounded_synthesis": 392, "inconsistent_api_recovery": 26, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 138, "tool_selection_stateful": 3, "basic_2step_stateful": 26, "sequential_3step_stateful": 126, "conditional_routing_stateful": 169, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 213, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 342, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 29.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 291.0, "inconsistent_api_recovery": 3.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 33.0, "sequential_reasoning_stateful": 0.0, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 280.0, "inconsistent_api_recovery_stateful": 4.0}, "scenarioWastedN": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 92.79, "argument_fidelity": 332.37, "tool_selection": 15.04, "basic_2step": 14.6, "sequential_3step": 654.97, "conditional_routing": 1250.52, "sequential_reasoning": 815.49, "error_recovery": 26.64, "data_gap_recovery": 1122.07, "data_gap_recovery_extended": 1668.12, "argument_transformation": 2277.67, "grounded_synthesis": 7149.99, "inconsistent_api_recovery": 1188.58, "relevance_detection_stateful": 88.72, "argument_fidelity_stateful": 356.57, "tool_selection_stateful": 7.94, "basic_2step_stateful": 60.78, "sequential_3step_stateful": 733.63, "conditional_routing_stateful": 1427.97, "sequential_reasoning_stateful": 872.97, "error_recovery_stateful": 28.54, "data_gap_recovery_stateful": 1170.31, "data_gap_recovery_extended_stateful": 1790.73, "argument_transformation_stateful": 2999.66, "grounded_synthesis_stateful": 6843.9, "inconsistent_api_recovery_stateful": 1149.66}, 
"scenarioSpeedN": {"relevance_detection": 45, "argument_fidelity": 46, "tool_selection": 2, "basic_2step": 2, "sequential_3step": 47, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 7, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 37, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 43, "argument_fidelity_stateful": 46, "tool_selection_stateful": 1, "basic_2step_stateful": 13, "sequential_3step_stateful": 45, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/N [bare:keep-last]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 46.4, "accuracy": 64.3, "completeness": 72.2, "efficiency": 100.0, "wasted": 0.1, "speed": 15.0, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 78, "tool_selection": 2, "basic_2step": 86, "sequential_3step": 70, "conditional_routing": 94, "sequential_reasoning": 76, "error_recovery": 0, "data_gap_recovery": 74, "data_gap_recovery_extended": 4, "argument_transformation": 34, "grounded_synthesis": 4, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 80, "argument_fidelity_stateful": 76, "tool_selection_stateful": 4, "basic_2step_stateful": 76, "sequential_3step_stateful": 76, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 86, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 76, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": 
{"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 35, "conditional_routing": 47, "sequential_reasoning": 38, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 2, "argument_transformation": 17, "grounded_synthesis": 2, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 40, "argument_fidelity_stateful": 38, "tool_selection_stateful": 2, "basic_2step_stateful": 38, "sequential_3step_stateful": 38, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 48, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 38, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 43, "argument_transformation": 37, "grounded_synthesis": 41, "inconsistent_api_recovery": 49, 
"relevance_detection_stateful": 40, "argument_fidelity_stateful": 38, "tool_selection_stateful": 2, "basic_2step_stateful": 38, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 48}, "scenarioValidated": {"relevance_detection": 48, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 38, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 43, "argument_transformation": 37, "grounded_synthesis": 41, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 40, "argument_fidelity_stateful": 38, "tool_selection_stateful": 2, "basic_2step_stateful": 38, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 48}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 117, "tool_selection": 3, "basic_2step": 86, "sequential_3step": 105, "conditional_routing": 188, "sequential_reasoning": 152, "error_recovery": 0, "data_gap_recovery": 185, "data_gap_recovery_extended": 16, "argument_transformation": 85, "grounded_synthesis": 20, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 40, "argument_fidelity_stateful": 114, "tool_selection_stateful": 6, "basic_2step_stateful": 76, "sequential_3step_stateful": 114, "conditional_routing_stateful": 192, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 190, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 10, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 117, "tool_selection": 3, "basic_2step": 86, "sequential_3step": 105, "conditional_routing": 231, "sequential_reasoning": 152, "error_recovery": 0, "data_gap_recovery": 155, "data_gap_recovery_extended": 10, "argument_transformation": 69, "grounded_synthesis": 10, "inconsistent_api_recovery": 35, "relevance_detection_stateful": 40, "argument_fidelity_stateful": 114, "tool_selection_stateful": 6, "basic_2step_stateful": 76, "sequential_3step_stateful": 114, "conditional_routing_stateful": 239, "sequential_reasoning_stateful": 172, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 160, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 43.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 2.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 7.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 47.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 3.0, "grounded_synthesis_stateful": 1.0, "inconsistent_api_recovery_stateful": 4.0}, "scenarioWastedN": {"relevance_detection": 48, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 38, 
"error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 43, "argument_transformation": 37, "grounded_synthesis": 41, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 40, "argument_fidelity_stateful": 38, "tool_selection_stateful": 2, "basic_2step_stateful": 38, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 48}, "scenarioSpeedSum": {"relevance_detection": 55.73, "argument_fidelity": 261.07, "tool_selection": 4.7, "basic_2step": 119.43, "sequential_3step": 344.43, "conditional_routing": 946.51, "sequential_reasoning": 470.59, "error_recovery": 0.0, "data_gap_recovery": 612.34, "data_gap_recovery_extended": 898.8, "argument_transformation": 1393.5, "grounded_synthesis": 1074.47, "inconsistent_api_recovery": 804.61, "relevance_detection_stateful": 43.49, "argument_fidelity_stateful": 241.18, "tool_selection_stateful": 8.73, "basic_2step_stateful": 121.55, "sequential_3step_stateful": 386.34, "conditional_routing_stateful": 939.99, "sequential_reasoning_stateful": 519.52, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 584.92, "data_gap_recovery_extended_stateful": 899.64, "argument_transformation_stateful": 1457.44, "grounded_synthesis_stateful": 1162.71, "inconsistent_api_recovery_stateful": 740.2}, "scenarioSpeedN": {"relevance_detection": 48, "argument_fidelity": 39, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 38, "error_recovery": 0, "data_gap_recovery": 43, "data_gap_recovery_extended": 43, "argument_transformation": 37, "grounded_synthesis": 41, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 40, "argument_fidelity_stateful": 38, 
"tool_selection_stateful": 2, "basic_2step_stateful": 38, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 41, "data_gap_recovery_extended_stateful": 43, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 41, "inconsistent_api_recovery_stateful": 48}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare:full]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 45.6, "accuracy": 80.2, "completeness": 56.8, "efficiency": 100.0, "wasted": 0.3, "speed": 5.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 72, "basic_2step": 78, "sequential_3step": 98, "conditional_routing": 72, "sequential_reasoning": 6, "error_recovery": 0, "data_gap_recovery": 26, "data_gap_recovery_extended": 18, "argument_transformation": 34, "grounded_synthesis": 36, "inconsistent_api_recovery": 94, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 44, "basic_2step_stateful": 96, "sequential_3step_stateful": 92, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 28, "argument_transformation_stateful": 26, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 24}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 39, "sequential_3step": 49, "conditional_routing": 36, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 13, "data_gap_recovery_extended": 9, "argument_transformation": 17, "grounded_synthesis": 18, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 22, "basic_2step_stateful": 48, "sequential_3step_stateful": 46, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 13, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 12}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 39, "sequential_3step": 49, "conditional_routing": 38, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 17, "data_gap_recovery_extended": 16, "argument_transformation": 46, "grounded_synthesis": 27, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 22, "basic_2step_stateful": 48, "sequential_3step_stateful": 46, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 48, 
"grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 39, "sequential_3step": 49, "conditional_routing": 38, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 17, "data_gap_recovery_extended": 16, "argument_transformation": 46, "grounded_synthesis": 27, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 22, "basic_2step_stateful": 48, "sequential_3step_stateful": 46, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 108, "basic_2step": 78, "sequential_3step": 147, "conditional_routing": 144, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 65, "data_gap_recovery_extended": 72, "argument_transformation": 85, "grounded_synthesis": 180, "inconsistent_api_recovery": 376, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 66, "basic_2step_stateful": 96, "sequential_3step_stateful": 138, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 112, "argument_transformation_stateful": 65, "grounded_synthesis_stateful": 200, "inconsistent_api_recovery_stateful": 96}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 108, "basic_2step": 78, "sequential_3step": 147, "conditional_routing": 144, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 62, 
"data_gap_recovery_extended": 50, "argument_transformation": 91, "grounded_synthesis": 98, "inconsistent_api_recovery": 430, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 66, "basic_2step_stateful": 96, "sequential_3step_stateful": 138, "conditional_routing_stateful": 167, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 83, "argument_transformation_stateful": 65, "grounded_synthesis_stateful": 113, "inconsistent_api_recovery_stateful": 138}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 17.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 15.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 93.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 20.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 7.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 81.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 39, "sequential_3step": 49, "conditional_routing": 38, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 17, "data_gap_recovery_extended": 16, "argument_transformation": 46, "grounded_synthesis": 27, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 22, "basic_2step_stateful": 48, "sequential_3step_stateful": 46, "conditional_routing_stateful": 
42, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 46.51, "tool_selection": 25.91, "basic_2step": 17.56, "sequential_3step": 65.6, "conditional_routing": 235.29, "sequential_reasoning": 9.98, "error_recovery": 0.0, "data_gap_recovery": 92.41, "data_gap_recovery_extended": 173.02, "argument_transformation": 632.31, "grounded_synthesis": 371.7, "inconsistent_api_recovery": 444.12, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 46.42, "tool_selection_stateful": 15.35, "basic_2step_stateful": 24.22, "sequential_3step_stateful": 53.35, "conditional_routing_stateful": 240.92, "sequential_reasoning_stateful": 2.23, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 62.35, "data_gap_recovery_extended_stateful": 202.31, "argument_transformation_stateful": 643.92, "grounded_synthesis_stateful": 457.72, "inconsistent_api_recovery_stateful": 395.89}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 36, "basic_2step": 39, "sequential_3step": 49, "conditional_routing": 38, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 17, "data_gap_recovery_extended": 16, "argument_transformation": 46, "grounded_synthesis": 27, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 22, "basic_2step_stateful": 48, "sequential_3step_stateful": 46, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 49}}, {"label": 
"Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [bare]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 45.8, "accuracy": 81.5, "completeness": 56.2, "efficiency": 94.3, "wasted": 0.7, "speed": 2.6, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 2, "error_recovery": 0, "data_gap_recovery": 98, "data_gap_recovery_extended": 0, "argument_transformation": 16, "grounded_synthesis": 36, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 98, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 
50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 18, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 47, "grounded_synthesis": 19, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 47, "grounded_synthesis": 19, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 0, "argument_transformation": 40, "grounded_synthesis": 180, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 245, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 190, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 147, "data_gap_recovery_extended": 0, "argument_transformation": 34, "grounded_synthesis": 229, "inconsistent_api_recovery": 547, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 147, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 247, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 16.0, "grounded_synthesis": 54.0, "inconsistent_api_recovery": 147.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 8.0, "grounded_synthesis_stateful": 57.0, "inconsistent_api_recovery_stateful": 150.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 47, "grounded_synthesis": 19, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 38.16, "tool_selection": 0.0, "basic_2step": 19.05, "sequential_3step": 91.1, "conditional_routing": 112.24, "sequential_reasoning": 
1.45, "error_recovery": 0.0, "data_gap_recovery": 120.58, "data_gap_recovery_extended": 0.0, "argument_transformation": 276.5, "grounded_synthesis": 94.64, "inconsistent_api_recovery": 186.98, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 38.09, "tool_selection_stateful": 0.0, "basic_2step_stateful": 21.12, "sequential_3step_stateful": 91.88, "conditional_routing_stateful": 111.1, "sequential_reasoning_stateful": 2.9, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 120.03, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 286.06, "grounded_synthesis_stateful": 95.67, "inconsistent_api_recovery_stateful": 187.39}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, "argument_transformation": 47, "grounded_synthesis": 19, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-8B-Q4_K_M LS/N [bare:full]", "model": "Qwen3-8B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 45.2, "accuracy": 63.7, "completeness": 71.0, "efficiency": 100.0, "wasted": 0.1, "speed": 13.6, "n": 50, "scenarios": {"relevance_detection": 92, "argument_fidelity": 80, "tool_selection": 2, "basic_2step": 86, "sequential_3step": 76, "conditional_routing": 100, 
"sequential_reasoning": 80, "error_recovery": 0, "data_gap_recovery": 74, "data_gap_recovery_extended": 0, "argument_transformation": 14, "grounded_synthesis": 12, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 88, "argument_fidelity_stateful": 68, "tool_selection_stateful": 4, "basic_2step_stateful": 74, "sequential_3step_stateful": 82, "conditional_routing_stateful": 94, "sequential_reasoning_stateful": 70, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 46, "argument_fidelity": 40, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 37, "data_gap_recovery_extended": 0, "argument_transformation": 7, "grounded_synthesis": 6, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 34, "tool_selection_stateful": 2, "basic_2step_stateful": 37, "sequential_3step_stateful": 41, 
"conditional_routing_stateful": 47, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 7, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 46, "argument_fidelity": 40, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 41, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 39, "argument_transformation": 38, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 34, "tool_selection_stateful": 2, "basic_2step_stateful": 37, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 46, "argument_fidelity": 40, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 41, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 39, "argument_transformation": 38, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 34, "tool_selection_stateful": 2, "basic_2step_stateful": 37, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 46, 
"argument_fidelity": 120, "tool_selection": 3, "basic_2step": 86, "sequential_3step": 114, "conditional_routing": 200, "sequential_reasoning": 160, "error_recovery": 0, "data_gap_recovery": 185, "data_gap_recovery_extended": 0, "argument_transformation": 35, "grounded_synthesis": 60, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 102, "tool_selection_stateful": 6, "basic_2step_stateful": 74, "sequential_3step_stateful": 123, "conditional_routing_stateful": 188, "sequential_reasoning_stateful": 140, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 150, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 70, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 46, "argument_fidelity": 120, "tool_selection": 3, "basic_2step": 86, "sequential_3step": 114, "conditional_routing": 247, "sequential_reasoning": 160, "error_recovery": 0, "data_gap_recovery": 154, "data_gap_recovery_extended": 0, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 102, "tool_selection_stateful": 6, "basic_2step_stateful": 74, "sequential_3step_stateful": 123, "conditional_routing_stateful": 234, "sequential_reasoning_stateful": 140, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 123, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 11, "grounded_synthesis_stateful": 56, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 47.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 6.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 4.0, "grounded_synthesis": 7.0, "inconsistent_api_recovery": 1.0, 
"relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 46.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 5.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 15.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 46, "argument_fidelity": 40, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 41, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 39, "argument_transformation": 38, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 34, "tool_selection_stateful": 2, "basic_2step_stateful": 37, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 49.67, "argument_fidelity": 256.09, "tool_selection": 4.19, "basic_2step": 115.42, "sequential_3step": 376.02, "conditional_routing": 909.85, "sequential_reasoning": 467.99, "error_recovery": 0.0, "data_gap_recovery": 662.38, "data_gap_recovery_extended": 666.65, "argument_transformation": 1122.42, "grounded_synthesis": 1091.78, "inconsistent_api_recovery": 653.19, "relevance_detection_stateful": 45.98, "argument_fidelity_stateful": 203.66, "tool_selection_stateful": 11.83, "basic_2step_stateful": 112.1, "sequential_3step_stateful": 399.31, "conditional_routing_stateful": 957.75, "sequential_reasoning_stateful": 411.43, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 574.14, "data_gap_recovery_extended_stateful": 830.75, "argument_transformation_stateful": 1069.62, "grounded_synthesis_stateful": 987.22, "inconsistent_api_recovery_stateful": 589.68}, "scenarioSpeedN": {"relevance_detection": 46, "argument_fidelity": 40, "tool_selection": 1, "basic_2step": 43, "sequential_3step": 41, "conditional_routing": 50, "sequential_reasoning": 40, "error_recovery": 0, "data_gap_recovery": 47, "data_gap_recovery_extended": 39, "argument_transformation": 38, "grounded_synthesis": 37, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 44, "argument_fidelity_stateful": 34, "tool_selection_stateful": 2, "basic_2step_stateful": 37, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 35, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3.Q4_K_M LF/P [reforged:full]", "model": "Mistral-7B-Instruct-v0.3.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 44.2, "accuracy": 45.6, "completeness": 96.9, "efficiency": 86.7, "wasted": 0.7, "speed": 5.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 12, "sequential_reasoning": 82, "error_recovery": 16, "data_gap_recovery": 22, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 90, "conditional_routing_stateful": 26, 
"sequential_reasoning_stateful": 20, "error_recovery_stateful": 18, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 6, "sequential_reasoning": 41, "error_recovery": 8, "data_gap_recovery": 11, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 5, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 45, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 24, "sequential_reasoning": 164, "error_recovery": 16, "data_gap_recovery": 55, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, 
"tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 135, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 27, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 199, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 212, "conditional_routing": 21, "sequential_reasoning": 142, "error_recovery": 32, "data_gap_recovery": 56, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 40, "inconsistent_api_recovery": 81, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 200, "tool_selection_stateful": 0, "basic_2step_stateful": 147, "sequential_3step_stateful": 186, "conditional_routing_stateful": 54, "sequential_reasoning_stateful": 53, "error_recovery_stateful": 36, "data_gap_recovery_stateful": 32, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 64, "inconsistent_api_recovery_stateful": 27}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 49.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 64.0, "conditional_routing": 35.0, "sequential_reasoning": 37.0, "error_recovery": 102.0, "data_gap_recovery": 14.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 2.0, "grounded_synthesis": 36.0, "inconsistent_api_recovery": 53.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 47.0, "sequential_3step_stateful": 54.0, "conditional_routing_stateful": 36.0, "sequential_reasoning_stateful": 21.0, "error_recovery_stateful": 52.0, "data_gap_recovery_stateful": 16.0, "data_gap_recovery_extended_stateful": 9.0, 
"argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 28.0, "inconsistent_api_recovery_stateful": 61.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 30.01, "argument_fidelity": 132.95, "tool_selection": 57.16, "basic_2step": 73.43, "sequential_3step": 135.85, "conditional_routing": 257.72, "sequential_reasoning": 156.75, "error_recovery": 94.68, "data_gap_recovery": 251.89, "data_gap_recovery_extended": 375.06, "argument_transformation": 630.96, "grounded_synthesis": 831.62, "inconsistent_api_recovery": 594.59, "relevance_detection_stateful": 30.3, "argument_fidelity_stateful": 133.22, "tool_selection_stateful": 57.68, "basic_2step_stateful": 88.08, "sequential_3step_stateful": 125.86, "conditional_routing_stateful": 257.93, "sequential_reasoning_stateful": 123.91, "error_recovery_stateful": 89.08, "data_gap_recovery_stateful": 245.63, "data_gap_recovery_extended_stateful": 389.59, "argument_transformation_stateful": 717.4, "grounded_synthesis_stateful": 799.3, "inconsistent_api_recovery_stateful": 605.87}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 
50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 36, "data_gap_recovery_extended": 45, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 49}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/P [reforged:full]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 43.9, "accuracy": 45.5, "completeness": 96.5, "efficiency": 84.6, "wasted": 0.7, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 26, "sequential_reasoning": 84, "error_recovery": 12, "data_gap_recovery": 30, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 66, "conditional_routing_stateful": 18, "sequential_reasoning_stateful": 22, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 8}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 
50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 13, "sequential_reasoning": 42, "error_recovery": 6, "data_gap_recovery": 15, "data_gap_recovery_extended": 1, "argument_transformation": 0, "grounded_synthesis": 5, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 33, "conditional_routing_stateful": 9, "sequential_reasoning_stateful": 11, "error_recovery_stateful": 3, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 7, "inconsistent_api_recovery_stateful": 4}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 52, "sequential_reasoning": 168, "error_recovery": 12, "data_gap_recovery": 75, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 56, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 99, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 75, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 70, 
"inconsistent_api_recovery_stateful": 32}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 200, "tool_selection": 100, "basic_2step": 150, "sequential_3step": 250, "conditional_routing": 69, "sequential_reasoning": 103, "error_recovery": 24, "data_gap_recovery": 84, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 33, "inconsistent_api_recovery": 65, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 200, "tool_selection_stateful": 0, "basic_2step_stateful": 150, "sequential_3step_stateful": 149, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 83, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 65, "inconsistent_api_recovery_stateful": 43}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 50.0, "tool_selection": 0.0, "basic_2step": 50.0, "sequential_3step": 100.0, "conditional_routing": 60.0, "sequential_reasoning": 14.0, "error_recovery": 100.0, "data_gap_recovery": 15.0, "data_gap_recovery_extended": 5.0, "argument_transformation": 3.0, "grounded_synthesis": 9.0, "inconsistent_api_recovery": 56.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 50.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 50.0, "sequential_3step_stateful": 60.0, "conditional_routing_stateful": 56.0, "sequential_reasoning_stateful": 27.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 21.0, "data_gap_recovery_extended_stateful": 15.0, "argument_transformation_stateful": 7.0, "grounded_synthesis_stateful": 46.0, "inconsistent_api_recovery_stateful": 56.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, 
"data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 13.75, "argument_fidelity": 63.3, "tool_selection": 26.66, "basic_2step": 33.66, "sequential_3step": 72.21, "conditional_routing": 152.98, "sequential_reasoning": 68.95, "error_recovery": 37.12, "data_gap_recovery": 149.42, "data_gap_recovery_extended": 203.01, "argument_transformation": 344.15, "grounded_synthesis": 443.08, "inconsistent_api_recovery": 311.11, "relevance_detection_stateful": 13.84, "argument_fidelity_stateful": 62.59, "tool_selection_stateful": 27.2, "basic_2step_stateful": 40.3, "sequential_3step_stateful": 46.53, "conditional_routing_stateful": 147.85, "sequential_reasoning_stateful": 59.9, "error_recovery_stateful": 39.37, "data_gap_recovery_stateful": 154.54, "data_gap_recovery_extended_stateful": 210.4, "argument_transformation_stateful": 346.92, "grounded_synthesis_stateful": 518.39, "inconsistent_api_recovery_stateful": 314.31}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 41, "data_gap_recovery_extended": 47, "argument_transformation": 50, "grounded_synthesis": 48, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 36, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 42, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare:full]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 44.3, "accuracy": 77.9, "completeness": 56.8, "efficiency": 100.0, "wasted": 0.3, "speed": 5.7, "n": 50, "scenarios": {"relevance_detection": 4, "argument_fidelity": 100, "tool_selection": 94, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 84, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 10, "argument_transformation": 20, "grounded_synthesis": 24, "inconsistent_api_recovery": 86, "relevance_detection_stateful": 4, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 84, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, 
"basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 42, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 5, "argument_transformation": 10, "grounded_synthesis": 12, "inconsistent_api_recovery": 43, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 42, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 14, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 6, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 42}, 
"scenarioValidated": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 6, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 42}, "scenarioIdealCalls": {"relevance_detection": 2, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 168, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 40, "argument_transformation": 50, "grounded_synthesis": 120, "inconsistent_api_recovery": 344, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 168, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 140, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 2, "argument_fidelity": 150, "tool_selection": 141, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 192, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 31, "argument_transformation": 48, "grounded_synthesis": 80, 
"inconsistent_api_recovery": 355, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 197, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 16, "grounded_synthesis_stateful": 99, "inconsistent_api_recovery_stateful": 11}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 32.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 10.0, "grounded_synthesis": 8.0, "inconsistent_api_recovery": 40.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 35.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 15.0, "grounded_synthesis_stateful": 10.0, "inconsistent_api_recovery_stateful": 52.0}, "scenarioWastedN": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 6, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 42}, "scenarioSpeedSum": {"relevance_detection": 1.0, "argument_fidelity": 51.72, "tool_selection": 36.06, "basic_2step": 23.34, "sequential_3step": 50.5, "conditional_routing": 288.87, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 7.39, "data_gap_recovery_extended": 40.61, "argument_transformation": 679.78, "grounded_synthesis": 316.28, "inconsistent_api_recovery": 650.97, "relevance_detection_stateful": 1.3, "argument_fidelity_stateful": 50.97, "tool_selection_stateful": 38.4, "basic_2step_stateful": 27.06, "sequential_3step_stateful": 47.42, "conditional_routing_stateful": 293.93, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 14.76, "data_gap_recovery_extended_stateful": 28.76, "argument_transformation_stateful": 784.09, "grounded_synthesis_stateful": 290.81, "inconsistent_api_recovery_stateful": 504.37}, "scenarioSpeedN": {"relevance_detection": 2, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 6, "argument_transformation": 45, "grounded_synthesis": 31, "inconsistent_api_recovery": 46, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 5, "argument_transformation_stateful": 48, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 42}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare:keep-last]", "model": 
"Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 44.2, "accuracy": 79.3, "completeness": 55.7, "efficiency": 100.0, "wasted": 0.3, "speed": 4.8, "n": 50, "scenarios": {"relevance_detection": 10, "argument_fidelity": 100, "tool_selection": 92, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 16, "argument_transformation": 26, "grounded_synthesis": 30, "inconsistent_api_recovery": 52, "relevance_detection_stateful": 4, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 78, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 5, 
"argument_fidelity": 50, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 13, "grounded_synthesis": 15, "inconsistent_api_recovery": 26, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 5, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 46, "grounded_synthesis": 33, "inconsistent_api_recovery": 29, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 29}, "scenarioValidated": {"relevance_detection": 5, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 46, "grounded_synthesis": 33, "inconsistent_api_recovery": 29, "relevance_detection_stateful": 2, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 29}, "scenarioIdealCalls": {"relevance_detection": 5, "argument_fidelity": 150, "tool_selection": 138, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 64, "argument_transformation": 65, "grounded_synthesis": 150, "inconsistent_api_recovery": 208, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 156, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 5, "argument_fidelity": 150, "tool_selection": 138, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 221, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 47, "argument_transformation": 57, "grounded_synthesis": 86, "inconsistent_api_recovery": 237, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 186, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 53, 
"argument_transformation_stateful": 10, "grounded_synthesis_stateful": 66, "inconsistent_api_recovery_stateful": 22}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 38.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 10.0, "grounded_synthesis": 12.0, "inconsistent_api_recovery": 44.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 34.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 14.0, "grounded_synthesis_stateful": 5.0, "inconsistent_api_recovery_stateful": 42.0}, "scenarioWastedN": {"relevance_detection": 5, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 46, "grounded_synthesis": 33, "inconsistent_api_recovery": 29, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 29}, "scenarioSpeedSum": {"relevance_detection": 2.19, "argument_fidelity": 51.15, "tool_selection": 36.01, "basic_2step": 22.58, "sequential_3step": 50.79, "conditional_routing": 275.85, "sequential_reasoning": 
0.0, "error_recovery": 0.0, "data_gap_recovery": 4.54, "data_gap_recovery_extended": 54.19, "argument_transformation": 684.5, "grounded_synthesis": 309.98, "inconsistent_api_recovery": 264.41, "relevance_detection_stateful": 1.28, "argument_fidelity_stateful": 49.72, "tool_selection_stateful": 37.0, "basic_2step_stateful": 26.1, "sequential_3step_stateful": 49.89, "conditional_routing_stateful": 230.74, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.6, "data_gap_recovery_extended_stateful": 78.58, "argument_transformation_stateful": 678.39, "grounded_synthesis_stateful": 286.89, "inconsistent_api_recovery_stateful": 254.43}, "scenarioSpeedN": {"relevance_detection": 5, "argument_fidelity": 50, "tool_selection": 47, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 46, "grounded_synthesis": 33, "inconsistent_api_recovery": 29, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 46, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 29}}, {"label": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [bare:full]", "model": "Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "mistral-small-3.2", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 42.5, "accuracy": 67.4, "completeness": 63.1, "efficiency": 100.0, "wasted": 0.0, "speed": 6.1, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, 
"basic_2step": 100, "sequential_3step": 100, "conditional_routing": 68, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 4, "argument_transformation": 12, "grounded_synthesis": 12, "inconsistent_api_recovery": 98, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 20, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 6, "grounded_synthesis": 6, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 
50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}, 
"scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 136, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 16, "argument_transformation": 30, "grounded_synthesis": 60, "inconsistent_api_recovery": 392, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 136, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 9, "argument_transformation": 23, "grounded_synthesis": 34, "inconsistent_api_recovery": 249, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 51, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 
0.0, "inconsistent_api_recovery": 1.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 88.58, "tool_selection": 57.49, "basic_2step": 40.15, "sequential_3step": 86.77, "conditional_routing": 400.39, "sequential_reasoning": 23.68, "error_recovery": 0.0, "data_gap_recovery": 23.93, "data_gap_recovery_extended": 142.31, "argument_transformation": 712.42, "grounded_synthesis": 511.37, "inconsistent_api_recovery": 369.29, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 88.87, "tool_selection_stateful": 57.66, "basic_2step_stateful": 48.83, "sequential_3step_stateful": 84.93, "conditional_routing_stateful": 369.64, "sequential_reasoning_stateful": 50.28, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 10.55, "data_gap_recovery_extended_stateful": 140.11, "argument_transformation_stateful": 686.06, "grounded_synthesis_stateful": 632.49, "inconsistent_api_recovery_stateful": 341.32}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 15, "argument_transformation": 50, "grounded_synthesis": 34, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 49}}, {"label": "Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare]", "model": "Ministral-3-8B-Reasoning-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 43.5, "accuracy": 80.3, "completeness": 54.2, "efficiency": 100.0, "wasted": 0.3, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 2, "argument_fidelity": 100, "tool_selection": 98, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 86, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 14, "argument_transformation": 12, "grounded_synthesis": 14, "inconsistent_api_recovery": 72, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 100, "tool_selection_stateful": 98, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 86, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 22, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 1, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 7, "argument_transformation": 6, "grounded_synthesis": 7, "inconsistent_api_recovery": 36, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 43, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 11, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 1, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 
50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 31, "grounded_synthesis": 33, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 35}, "scenarioValidated": {"relevance_detection": 1, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 31, "grounded_synthesis": 33, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 35}, "scenarioIdealCalls": {"relevance_detection": 1, "argument_fidelity": 150, "tool_selection": 147, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 172, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 56, "argument_transformation": 30, "grounded_synthesis": 70, "inconsistent_api_recovery": 288, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, 
"basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 172, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 64, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 110, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 1, "argument_fidelity": 150, "tool_selection": 147, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 203, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 44, "argument_transformation": 37, "grounded_synthesis": 41, "inconsistent_api_recovery": 313, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 150, "tool_selection_stateful": 147, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 198, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 40, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 84, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 36.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 15.0, "grounded_synthesis": 8.0, "inconsistent_api_recovery": 48.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 33.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 11.0, "grounded_synthesis_stateful": 
14.0, "inconsistent_api_recovery_stateful": 44.0}, "scenarioWastedN": {"relevance_detection": 1, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 8, "argument_transformation": 31, "grounded_synthesis": 33, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 35}, "scenarioSpeedSum": {"relevance_detection": 0.88, "argument_fidelity": 51.19, "tool_selection": 40.17, "basic_2step": 22.55, "sequential_3step": 52.74, "conditional_routing": 195.38, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 3.95, "data_gap_recovery_extended": 50.77, "argument_transformation": 423.31, "grounded_synthesis": 283.94, "inconsistent_api_recovery": 236.31, "relevance_detection_stateful": 0.35, "argument_fidelity_stateful": 49.52, "tool_selection_stateful": 38.31, "basic_2step_stateful": 26.05, "sequential_3step_stateful": 51.72, "conditional_routing_stateful": 197.23, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.91, "data_gap_recovery_extended_stateful": 58.47, "argument_transformation_stateful": 396.92, "grounded_synthesis_stateful": 251.19, "inconsistent_api_recovery_stateful": 231.43}, "scenarioSpeedN": {"relevance_detection": 1, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 43, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, 
"data_gap_recovery_extended": 8, "argument_transformation": 31, "grounded_synthesis": 33, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 31, "inconsistent_api_recovery_stateful": 35}}, {"label": "granite-4.1-8b-Q8_0 LS/P [bare]", "model": "granite-4.1-8b-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "none", "family": "granite-4.1-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 42.3, "accuracy": 50.0, "completeness": 84.6, "efficiency": 86.0, "wasted": 0.4, "speed": 2.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, 
"sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 695, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 695, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 196.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 196.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 63.79, "tool_selection": 36.57, "basic_2step": 32.08, "sequential_3step": 63.15, "conditional_routing": 148.19, "sequential_reasoning": 81.06, "error_recovery": 0.0, "data_gap_recovery": 61.79, "data_gap_recovery_extended": 102.12, "argument_transformation": 323.14, "grounded_synthesis": 283.35, "inconsistent_api_recovery": 119.62, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 64.05, "tool_selection_stateful": 36.52, "basic_2step_stateful": 35.33, "sequential_3step_stateful": 67.15, "conditional_routing_stateful": 147.54, "sequential_reasoning_stateful": 78.21, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 63.01, "data_gap_recovery_extended_stateful": 102.07, "argument_transformation_stateful": 304.45, "grounded_synthesis_stateful": 283.19, "inconsistent_api_recovery_stateful": 122.78}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/N [bare:full]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 42.2, "accuracy": 55.0, "completeness": 76.8, "efficiency": 100.0, "wasted": 0.2, "speed": 3.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 100, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 100, "data_gap_recovery_extended_stateful": 98, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 100, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 
0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 250, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 500, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 250, "data_gap_recovery_extended_stateful": 392, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 500, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 200, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 450, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 200, "data_gap_recovery_extended_stateful": 294, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 548, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 100.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 49.0, "inconsistent_api_recovery_stateful": 100.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, 
"argument_fidelity": 75.19, "tool_selection": 0.0, "basic_2step": 43.55, "sequential_3step": 68.11, "conditional_routing": 219.17, "sequential_reasoning": 90.8, "error_recovery": 0.0, "data_gap_recovery": 157.3, "data_gap_recovery_extended": 220.95, "argument_transformation": 134.97, "grounded_synthesis": 269.28, "inconsistent_api_recovery": 354.72, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 75.14, "tool_selection_stateful": 0.0, "basic_2step_stateful": 52.01, "sequential_3step_stateful": 68.92, "conditional_routing_stateful": 151.42, "sequential_reasoning_stateful": 88.71, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 157.11, "data_gap_recovery_extended_stateful": 325.68, "argument_transformation_stateful": 134.3, "grounded_synthesis_stateful": 333.48, "inconsistent_api_recovery_stateful": 359.22}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 49, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare:full]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 41.8, "accuracy": 72.7, "completeness": 57.5, 
"efficiency": 99.7, "wasted": 0.3, "speed": 3.7, "n": 50, "scenarios": {"relevance_detection": 6, "argument_fidelity": 100, "tool_selection": 84, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 70, "sequential_reasoning": 2, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 12, "argument_transformation": 8, "grounded_synthesis": 12, "inconsistent_api_recovery": 90, "relevance_detection_stateful": 14, "argument_fidelity_stateful": 100, "tool_selection_stateful": 82, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 66, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 18, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 2}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 35, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 6, "argument_transformation": 4, 
"grounded_synthesis": 6, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 41, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 33, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 1}, "scenarioCompleted": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 37, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 45}, "scenarioValidated": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 37, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 45}, "scenarioIdealCalls": {"relevance_detection": 3, "argument_fidelity": 150, "tool_selection": 126, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 140, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 48, "argument_transformation": 20, "grounded_synthesis": 60, "inconsistent_api_recovery": 360, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 150, "tool_selection_stateful": 123, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 132, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 72, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 80, "inconsistent_api_recovery_stateful": 8}, "scenarioActualCalls": {"relevance_detection": 3, "argument_fidelity": 150, "tool_selection": 126, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 160, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 33, "argument_transformation": 18, "grounded_synthesis": 36, "inconsistent_api_recovery": 399, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 150, "tool_selection_stateful": 123, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 159, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 60, "inconsistent_api_recovery_stateful": 12}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 27.0, 
"sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 8.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 56.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 29.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 5.0, "argument_transformation_stateful": 14.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 62.0}, "scenarioWastedN": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 37, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 45}, "scenarioSpeedSum": {"relevance_detection": 1.49, "argument_fidelity": 32.73, "tool_selection": 20.9, "basic_2step": 14.63, "sequential_3step": 27.47, "conditional_routing": 180.21, "sequential_reasoning": 2.19, "error_recovery": 0.0, "data_gap_recovery": 2.72, "data_gap_recovery_extended": 43.08, "argument_transformation": 470.98, "grounded_synthesis": 215.92, "inconsistent_api_recovery": 360.11, "relevance_detection_stateful": 2.73, "argument_fidelity_stateful": 33.26, 
"tool_selection_stateful": 21.42, "basic_2step_stateful": 17.06, "sequential_3step_stateful": 27.59, "conditional_routing_stateful": 188.44, "sequential_reasoning_stateful": 4.44, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 79.46, "argument_transformation_stateful": 410.37, "grounded_synthesis_stateful": 243.09, "inconsistent_api_recovery_stateful": 360.3}, "scenarioSpeedN": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 7, "argument_transformation": 48, "grounded_synthesis": 37, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 42, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 38, "inconsistent_api_recovery_stateful": 45}}, {"label": "qwen3:8b-q4_K_M OL/N [bare:full]", "model": "qwen3:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 40.7, "accuracy": 53.0, "completeness": 76.8, "efficiency": 96.2, "wasted": 0.1, "speed": 15.8, "n": 50, "scenarios": {"relevance_detection": 56, "argument_fidelity": 98, "tool_selection": 2, "basic_2step": 4, "sequential_3step": 100, "conditional_routing": 94, "sequential_reasoning": 100, "error_recovery": 2, "data_gap_recovery": 38, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 0, "inconsistent_api_recovery": 12, "relevance_detection_stateful": 68, "argument_fidelity_stateful": 100, 
"tool_selection_stateful": 6, "basic_2step_stateful": 78, "sequential_3step_stateful": 98, "conditional_routing_stateful": 74, "sequential_reasoning_stateful": 96, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 47, "sequential_reasoning": 50, "error_recovery": 1, "data_gap_recovery": 19, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 0, "inconsistent_api_recovery": 6, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 49, "conditional_routing_stateful": 37, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 14, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, 
"inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 28, "argument_fidelity": 147, "tool_selection": 3, "basic_2step": 4, "sequential_3step": 150, "conditional_routing": 188, "sequential_reasoning": 200, "error_recovery": 2, "data_gap_recovery": 95, "data_gap_recovery_extended": 0, "argument_transformation": 10, 
"grounded_synthesis": 0, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, "basic_2step_stateful": 78, "sequential_3step_stateful": 147, "conditional_routing_stateful": 148, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 70, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 28, "argument_fidelity": 147, "tool_selection": 3, "basic_2step": 4, "sequential_3step": 150, "conditional_routing": 222, "sequential_reasoning": 200, "error_recovery": 1, "data_gap_recovery": 98, "data_gap_recovery_extended": 0, "argument_transformation": 8, "grounded_synthesis": 0, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, "basic_2step_stateful": 78, "sequential_3step_stateful": 147, "conditional_routing_stateful": 185, "sequential_reasoning_stateful": 192, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 68, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 38.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 4.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 1.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 1.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 28, "argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 88.7, "argument_fidelity": 263.29, "tool_selection": 5.57, "basic_2step": 8.35, "sequential_3step": 409.89, "conditional_routing": 897.67, "sequential_reasoning": 350.9, "error_recovery": 138.66, "data_gap_recovery": 384.11, "data_gap_recovery_extended": 800.46, "argument_transformation": 1672.38, "grounded_synthesis": 1694.51, "inconsistent_api_recovery": 1023.5, "relevance_detection_stateful": 105.93, "argument_fidelity_stateful": 301.7, "tool_selection_stateful": 20.16, "basic_2step_stateful": 140.21, "sequential_3step_stateful": 438.09, "conditional_routing_stateful": 960.04, "sequential_reasoning_stateful": 414.75, "error_recovery_stateful": 145.6, "data_gap_recovery_stateful": 319.34, "data_gap_recovery_extended_stateful": 846.06, "argument_transformation_stateful": 1965.91, "grounded_synthesis_stateful": 1540.69, "inconsistent_api_recovery_stateful": 880.64}, "scenarioSpeedN": {"relevance_detection": 28, 
"argument_fidelity": 49, "tool_selection": 1, "basic_2step": 2, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 41, "data_gap_recovery": 28, "data_gap_recovery_extended": 42, "argument_transformation": 40, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 34, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 39, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 42, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 44, "grounded_synthesis_stateful": 47, "inconsistent_api_recovery_stateful": 49}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare:keep-last]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 41.5, "accuracy": 77.2, "completeness": 53.7, "efficiency": 100.0, "wasted": 0.3, "speed": 5.5, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 52, "basic_2step": 64, "sequential_3step": 94, "conditional_routing": 68, "sequential_reasoning": 2, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 4, "argument_transformation": 34, "grounded_synthesis": 40, "inconsistent_api_recovery": 82, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 64, "basic_2step_stateful": 90, "sequential_3step_stateful": 100, "conditional_routing_stateful": 82, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 24, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 16, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 14}, 
"scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 26, "basic_2step": 32, "sequential_3step": 47, "conditional_routing": 34, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 2, "argument_transformation": 17, "grounded_synthesis": 20, "inconsistent_api_recovery": 41, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 32, "basic_2step_stateful": 45, "sequential_3step_stateful": 50, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 7}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 26, "basic_2step": 32, "sequential_3step": 47, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 17, "argument_transformation": 47, "grounded_synthesis": 25, "inconsistent_api_recovery": 45, 
"relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 32, "basic_2step_stateful": 45, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 18, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 26, "basic_2step": 32, "sequential_3step": 47, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 17, "argument_transformation": 47, "grounded_synthesis": 25, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 32, "basic_2step_stateful": 45, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 18, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 78, "basic_2step": 64, "sequential_3step": 141, "conditional_routing": 136, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 30, "data_gap_recovery_extended": 16, "argument_transformation": 85, "grounded_synthesis": 200, "inconsistent_api_recovery": 328, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 96, "basic_2step_stateful": 90, "sequential_3step_stateful": 150, "conditional_routing_stateful": 164, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 16, 
"argument_transformation_stateful": 40, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 56}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 78, "basic_2step": 64, "sequential_3step": 141, "conditional_routing": 135, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 11, "argument_transformation": 79, "grounded_synthesis": 104, "inconsistent_api_recovery": 391, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 96, "basic_2step_stateful": 90, "sequential_3step_stateful": 150, "conditional_routing_stateful": 170, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 56, "data_gap_recovery_extended_stateful": 13, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 74, "inconsistent_api_recovery_stateful": 81}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 15.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 1.0, "argument_transformation": 7.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 83.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 22.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 3.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 71.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 26, "basic_2step": 32, "sequential_3step": 47, "conditional_routing": 36, "sequential_reasoning": 1, 
"error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 17, "argument_transformation": 47, "grounded_synthesis": 25, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 32, "basic_2step_stateful": 45, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 18, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 49.02, "tool_selection": 18.86, "basic_2step": 14.99, "sequential_3step": 64.66, "conditional_routing": 227.76, "sequential_reasoning": 4.74, "error_recovery": 0.0, "data_gap_recovery": 42.73, "data_gap_recovery_extended": 130.39, "argument_transformation": 647.08, "grounded_synthesis": 390.44, "inconsistent_api_recovery": 311.02, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 50.12, "tool_selection_stateful": 23.47, "basic_2step_stateful": 23.2, "sequential_3step_stateful": 55.45, "conditional_routing_stateful": 272.78, "sequential_reasoning_stateful": 2.09, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 90.87, "data_gap_recovery_extended_stateful": 80.65, "argument_transformation_stateful": 579.04, "grounded_synthesis_stateful": 360.13, "inconsistent_api_recovery_stateful": 369.46}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 26, "basic_2step": 32, "sequential_3step": 47, "conditional_routing": 36, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 17, "argument_transformation": 47, "grounded_synthesis": 25, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 32, 
"basic_2step_stateful": 45, "sequential_3step_stateful": 50, "conditional_routing_stateful": 44, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 18, "data_gap_recovery_extended_stateful": 11, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare:keep-last]", "model": "Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 41.2, "accuracy": 77.3, "completeness": 53.3, "efficiency": 100.0, "wasted": 0.2, "speed": 3.4, "n": 50, "scenarios": {"relevance_detection": 6, "argument_fidelity": 100, "tool_selection": 82, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 78, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 10, "argument_transformation": 20, "grounded_synthesis": 20, "inconsistent_api_recovery": 58, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 100, "tool_selection_stateful": 90, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 39, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 5, "argument_transformation": 10, "grounded_synthesis": 10, "inconsistent_api_recovery": 29, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 45, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 35, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 39, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 44, "grounded_synthesis": 37, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 45, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 29, 
"inconsistent_api_recovery_stateful": 34}, "scenarioValidated": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 39, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 44, "grounded_synthesis": 37, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 45, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 34}, "scenarioIdealCalls": {"relevance_detection": 3, "argument_fidelity": 150, "tool_selection": 123, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 156, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 40, "argument_transformation": 50, "grounded_synthesis": 100, "inconsistent_api_recovery": 232, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 150, "tool_selection_stateful": 135, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 3, "argument_fidelity": 150, "tool_selection": 123, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 181, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 18, "argument_transformation": 47, 
"grounded_synthesis": 60, "inconsistent_api_recovery": 244, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 150, "tool_selection_stateful": 135, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 159, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 9, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 32.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 17.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 32.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 27.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 10.0, "grounded_synthesis_stateful": 8.0, "inconsistent_api_recovery_stateful": 45.0}, "scenarioWastedN": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 39, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 44, "grounded_synthesis": 37, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 45, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 34}, "scenarioSpeedSum": {"relevance_detection": 1.37, "argument_fidelity": 34.66, "tool_selection": 21.64, "basic_2step": 15.59, "sequential_3step": 30.49, "conditional_routing": 194.11, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 6.27, "data_gap_recovery_extended": 36.66, "argument_transformation": 452.09, "grounded_synthesis": 248.37, "inconsistent_api_recovery": 218.97, "relevance_detection_stateful": 2.35, "argument_fidelity_stateful": 35.62, "tool_selection_stateful": 24.01, "basic_2step_stateful": 17.58, "sequential_3step_stateful": 29.7, "conditional_routing_stateful": 203.17, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 9.59, "data_gap_recovery_extended_stateful": 12.31, "argument_transformation_stateful": 329.06, "grounded_synthesis_stateful": 209.38, "inconsistent_api_recovery_stateful": 220.21}, "scenarioSpeedN": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 39, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 44, "grounded_synthesis": 37, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 45, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 29, "inconsistent_api_recovery_stateful": 34}}, {"label": "Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare]", "model": 
"Ministral-3-8B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 40.2, "accuracy": 77.0, "completeness": 52.2, "efficiency": 98.1, "wasted": 0.3, "speed": 2.8, "n": 50, "scenarios": {"relevance_detection": 6, "argument_fidelity": 100, "tool_selection": 84, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 68, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 6, "argument_transformation": 18, "grounded_synthesis": 8, "inconsistent_api_recovery": 62, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 100, "tool_selection_stateful": 86, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 3, 
"argument_fidelity": 50, "tool_selection": 42, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 3, "argument_transformation": 9, "grounded_synthesis": 4, "inconsistent_api_recovery": 31, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 35, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 28, "grounded_synthesis": 30, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 45}, "scenarioValidated": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 28, "grounded_synthesis": 30, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 5, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 45}, "scenarioIdealCalls": {"relevance_detection": 3, "argument_fidelity": 150, "tool_selection": 126, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 136, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 24, "argument_transformation": 45, "grounded_synthesis": 40, "inconsistent_api_recovery": 248, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 150, "tool_selection_stateful": 129, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 48, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 3, "argument_fidelity": 150, "tool_selection": 126, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 151, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 9, "data_gap_recovery_extended": 16, "argument_transformation": 37, "grounded_synthesis": 24, "inconsistent_api_recovery": 310, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 150, "tool_selection_stateful": 129, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 160, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 
0, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 23.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 62.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 27.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 4.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 77.0}, "scenarioWastedN": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 28, "grounded_synthesis": 30, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 45}, "scenarioSpeedSum": {"relevance_detection": 1.08, "argument_fidelity": 34.47, "tool_selection": 22.14, "basic_2step": 15.1, "sequential_3step": 28.01, "conditional_routing": 171.57, "sequential_reasoning": 0.0, "error_recovery": 0.0, 
"data_gap_recovery": 6.06, "data_gap_recovery_extended": 30.69, "argument_transformation": 252.58, "grounded_synthesis": 179.75, "inconsistent_api_recovery": 137.08, "relevance_detection_stateful": 2.58, "argument_fidelity_stateful": 35.11, "tool_selection_stateful": 21.21, "basic_2step_stateful": 17.25, "sequential_3step_stateful": 31.07, "conditional_routing_stateful": 179.68, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 6.85, "data_gap_recovery_extended_stateful": 33.72, "argument_transformation_stateful": 297.07, "grounded_synthesis_stateful": 180.07, "inconsistent_api_recovery_stateful": 192.33}, "scenarioSpeedN": {"relevance_detection": 3, "argument_fidelity": 50, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 34, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 7, "argument_transformation": 28, "grounded_synthesis": 30, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 50, "tool_selection_stateful": 43, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 36, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 33, "grounded_synthesis_stateful": 28, "inconsistent_api_recovery_stateful": 45}}, {"label": "Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare]", "model": "Ministral-3-14B-Reasoning-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 40.2, "accuracy": 81.0, "completeness": 49.7, "efficiency": 100.0, "wasted": 0.4, "speed": 4.5, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 50, "basic_2step": 76, "sequential_3step": 94, 
"conditional_routing": 70, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 6, "argument_transformation": 28, "grounded_synthesis": 32, "inconsistent_api_recovery": 86, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 52, "basic_2step_stateful": 92, "sequential_3step_stateful": 94, "conditional_routing_stateful": 78, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 16, "grounded_synthesis_stateful": 24, "inconsistent_api_recovery_stateful": 20}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 38, "sequential_3step": 47, "conditional_routing": 35, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 3, "argument_transformation": 14, "grounded_synthesis": 16, "inconsistent_api_recovery": 43, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 46, 
"sequential_3step_stateful": 47, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 10}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 38, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 8, "argument_transformation": 36, "grounded_synthesis": 27, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 46, "sequential_3step_stateful": 47, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 47}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 38, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 8, "argument_transformation": 36, "grounded_synthesis": 27, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 46, "sequential_3step_stateful": 47, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 47}, "scenarioIdealCalls": 
{"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 75, "basic_2step": 76, "sequential_3step": 141, "conditional_routing": 140, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 35, "data_gap_recovery_extended": 24, "argument_transformation": 70, "grounded_synthesis": 160, "inconsistent_api_recovery": 344, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 78, "basic_2step_stateful": 92, "sequential_3step_stateful": 141, "conditional_routing_stateful": 156, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 25, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 40, "grounded_synthesis_stateful": 120, "inconsistent_api_recovery_stateful": 80}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 75, "basic_2step": 76, "sequential_3step": 141, "conditional_routing": 134, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 21, "argument_transformation": 71, "grounded_synthesis": 85, "inconsistent_api_recovery": 407, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 78, "basic_2step_stateful": 92, "sequential_3step_stateful": 141, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 41, "grounded_synthesis_stateful": 62, "inconsistent_api_recovery_stateful": 120}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 14.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 7.0, "grounded_synthesis": 2.0, 
"inconsistent_api_recovery": 90.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 16.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 13.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 95.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 38, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 8, "argument_transformation": 36, "grounded_synthesis": 27, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 46, "sequential_3step_stateful": 47, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 47}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 49.04, "tool_selection": 18.04, "basic_2step": 17.16, "sequential_3step": 46.21, "conditional_routing": 199.49, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 31.61, "data_gap_recovery_extended": 43.4, "argument_transformation": 455.31, "grounded_synthesis": 432.05, "inconsistent_api_recovery": 251.76, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 48.65, "tool_selection_stateful": 18.2, "basic_2step_stateful": 23.98, "sequential_3step_stateful": 52.62, "conditional_routing_stateful": 178.22, "sequential_reasoning_stateful": 0.0, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 24.42, "data_gap_recovery_extended_stateful": 23.02, "argument_transformation_stateful": 459.4, "grounded_synthesis_stateful": 276.87, "inconsistent_api_recovery_stateful": 268.07}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 25, "basic_2step": 38, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 8, "argument_transformation": 36, "grounded_synthesis": 27, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 26, "basic_2step_stateful": 46, "sequential_3step_stateful": 47, "conditional_routing_stateful": 39, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 38, "grounded_synthesis_stateful": 19, "inconsistent_api_recovery_stateful": 47}}, {"label": "granite4.1:8b-q4_K_M OL/N [bare:full]", "model": "granite4.1:8b-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "granite-4.1-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 38.6, "accuracy": 50.2, "completeness": 76.9, "efficiency": 94.1, "wasted": 1.0, "speed": 2.1, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, 
"sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 250, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 50.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 400.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 50.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 2.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 400.0, "grounded_synthesis_stateful": 0.0, 
"inconsistent_api_recovery_stateful": 50.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 49.75, "tool_selection": 0.0, "basic_2step": 27.15, "sequential_3step": 41.51, "conditional_routing": 106.74, "sequential_reasoning": 54.63, "error_recovery": 0.0, "data_gap_recovery": 125.98, "data_gap_recovery_extended": 155.92, "argument_transformation": 188.18, "grounded_synthesis": 172.18, "inconsistent_api_recovery": 147.02, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 49.76, "tool_selection_stateful": 0.0, "basic_2step_stateful": 23.04, "sequential_3step_stateful": 41.54, "conditional_routing_stateful": 97.21, "sequential_reasoning_stateful": 54.64, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 126.88, "data_gap_recovery_extended_stateful": 155.97, "argument_transformation_stateful": 188.41, "grounded_synthesis_stateful": 172.28, "inconsistent_api_recovery_stateful": 147.11}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, 
"data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/N [bare:full]", "model": "Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 38.2, "accuracy": 45.3, "completeness": 84.5, "efficiency": 95.9, "wasted": 0.6, "speed": 1.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 2, "conditional_routing": 58, "sequential_reasoning": 56, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 4, "inconsistent_api_recovery": 28, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 98, "tool_selection_stateful": 54, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 32, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 
50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 1, "conditional_routing": 29, "sequential_reasoning": 28, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 49, "tool_selection_stateful": 27, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 16, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 3, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 3, "conditional_routing": 116, "sequential_reasoning": 112, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 20, "inconsistent_api_recovery": 112, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 104, "sequential_reasoning_stateful": 64, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 133, "basic_2step": 100, 
"sequential_3step": 3, "conditional_routing": 142, "sequential_reasoning": 83, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 24, "inconsistent_api_recovery": 151, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 147, "tool_selection_stateful": 81, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 130, "sequential_reasoning_stateful": 77, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 13, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 31.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 18.0, "grounded_synthesis": 160.0, "inconsistent_api_recovery": 77.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 18.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 32.0, "sequential_reasoning_stateful": 22.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 8.0, "grounded_synthesis_stateful": 220.0, "inconsistent_api_recovery_stateful": 78.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 
50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 10.35, "argument_fidelity": 40.98, "tool_selection": 34.09, "basic_2step": 18.11, "sequential_3step": 7.08, "conditional_routing": 76.01, "sequential_reasoning": 41.78, "error_recovery": 0.0, "data_gap_recovery": 19.12, "data_gap_recovery_extended": 19.66, "argument_transformation": 47.61, "grounded_synthesis": 159.89, "inconsistent_api_recovery": 161.92, "relevance_detection_stateful": 10.61, "argument_fidelity_stateful": 43.52, "tool_selection_stateful": 38.5, "basic_2step_stateful": 20.24, "sequential_3step_stateful": 6.34, "conditional_routing_stateful": 72.96, "sequential_reasoning_stateful": 70.2, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 26.38, "data_gap_recovery_extended_stateful": 17.01, "argument_transformation_stateful": 29.77, "grounded_synthesis_stateful": 184.4, "inconsistent_api_recovery_stateful": 169.59}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 28, "conditional_routing": 47, "sequential_reasoning": 46, "error_recovery": 0, "data_gap_recovery": 48, "data_gap_recovery_extended": 49, "argument_transformation": 49, "grounded_synthesis": 41, "inconsistent_api_recovery": 47, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 39, "basic_2step_stateful": 50, "sequential_3step_stateful": 28, "conditional_routing_stateful": 45, "sequential_reasoning_stateful": 43, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, 
"argument_transformation_stateful": 43, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 49}}, {"label": "granite-4.0-h-tiny-Q4_K_M LS/N [reforged:full]", "model": "granite-4.0-h-tiny-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 38.5, "accuracy": 45.7, "completeness": 84.2, "efficiency": 75.0, "wasted": 2.6, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 100, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, 
"error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 100, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 201, "basic_2step": 200, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 300, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 200, "basic_2step_stateful": 200, "sequential_3step_stateful": 150, 
"conditional_routing_stateful": 0, "sequential_reasoning_stateful": 300, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 51.0, "basic_2step": 100.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 100.0, "error_recovery": 0.0, "data_gap_recovery": 250.0, "data_gap_recovery_extended": 96.0, "argument_transformation": 452.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 250.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 50.0, "basic_2step_stateful": 100.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 100.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 250.0, "data_gap_recovery_extended_stateful": 92.0, "argument_transformation_stateful": 452.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 250.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, 
"scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 55.6, "tool_selection": 78.52, "basic_2step": 50.32, "sequential_3step": 46.96, "conditional_routing": 138.26, "sequential_reasoning": 156.9, "error_recovery": 0.0, "data_gap_recovery": 352.05, "data_gap_recovery_extended": 328.53, "argument_transformation": 260.35, "grounded_synthesis": 300.07, "inconsistent_api_recovery": 323.81, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 55.87, "tool_selection_stateful": 78.23, "basic_2step_stateful": 54.1, "sequential_3step_stateful": 47.07, "conditional_routing_stateful": 140.14, "sequential_reasoning_stateful": 154.63, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 360.19, "data_gap_recovery_extended_stateful": 317.45, "argument_transformation_stateful": 263.49, "grounded_synthesis_stateful": 298.47, "inconsistent_api_recovery_stateful": 322.8}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-micro-q4_K_M OL/N [reforged:full]", "model": "granite-4.0:h-micro-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 38.4, 
"accuracy": 55.6, "completeness": 69.1, "efficiency": 81.7, "wasted": 3.2, "speed": 7.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 98, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, 
"data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 245, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 392, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 300, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, 
"conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 196.0, "data_gap_recovery": 148.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 343.0, "grounded_synthesis": 400.0, "inconsistent_api_recovery": 348.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 100.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 147.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 4.0, "argument_transformation_stateful": 343.0, "grounded_synthesis_stateful": 400.0, "inconsistent_api_recovery_stateful": 348.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 66.86, "argument_fidelity": 98.78, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 79.72, "conditional_routing": 0.0, "sequential_reasoning": 122.43, "error_recovery": 342.05, "data_gap_recovery": 825.62, "data_gap_recovery_extended": 0.0, "argument_transformation": 401.08, "grounded_synthesis": 675.93, "inconsistent_api_recovery": 538.42, "relevance_detection_stateful": 66.75, 
"argument_fidelity_stateful": 97.65, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 79.72, "conditional_routing_stateful": 949.28, "sequential_reasoning_stateful": 120.57, "error_recovery_stateful": 337.54, "data_gap_recovery_stateful": 7.27, "data_gap_recovery_extended_stateful": 26.3, "argument_transformation_stateful": 403.09, "grounded_synthesis_stateful": 675.91, "inconsistent_api_recovery_stateful": 539.73}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 49, "data_gap_recovery": 50, "data_gap_recovery_extended": 0, "argument_transformation": 49, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/N [bare:full]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 36.5, "accuracy": 46.2, "completeness": 79.1, "efficiency": 96.2, "wasted": 0.8, "speed": 1.8, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 98, "tool_selection": 100, "basic_2step": 78, "sequential_3step": 4, "conditional_routing": 28, "sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 40, 
"relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 66, "basic_2step_stateful": 88, "sequential_3step_stateful": 0, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 44, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 49, "tool_selection": 50, "basic_2step": 39, "sequential_3step": 2, "conditional_routing": 14, "sequential_reasoning": 31, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 20, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 33, "basic_2step_stateful": 44, "sequential_3step_stateful": 0, "conditional_routing_stateful": 14, "sequential_reasoning_stateful": 22, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 45}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 45}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 150, "basic_2step": 78, "sequential_3step": 6, "conditional_routing": 56, "sequential_reasoning": 124, "error_recovery": 0, 
"data_gap_recovery": 15, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 160, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 99, "basic_2step_stateful": 88, "sequential_3step_stateful": 0, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 88, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 147, "tool_selection": 140, "basic_2step": 77, "sequential_3step": 6, "conditional_routing": 70, "sequential_reasoning": 94, "error_recovery": 0, "data_gap_recovery": 11, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 219, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 102, "basic_2step_stateful": 88, "sequential_3step_stateful": 0, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 99, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 16.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 133.0, "grounded_synthesis": 122.0, "inconsistent_api_recovery": 102.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 21.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, 
"conditional_routing_stateful": 16.0, "sequential_reasoning_stateful": 18.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 66.0, "grounded_synthesis_stateful": 197.0, "inconsistent_api_recovery_stateful": 108.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 45}, "scenarioSpeedSum": {"relevance_detection": 15.85, "argument_fidelity": 63.02, "tool_selection": 56.58, "basic_2step": 24.36, "sequential_3step": 5.54, "conditional_routing": 43.82, "sequential_reasoning": 60.86, "error_recovery": 3.31, "data_gap_recovery": 34.35, "data_gap_recovery_extended": 32.87, "argument_transformation": 103.09, "grounded_synthesis": 233.37, "inconsistent_api_recovery": 251.73, "relevance_detection_stateful": 16.15, "argument_fidelity_stateful": 64.36, "tool_selection_stateful": 63.02, "basic_2step_stateful": 28.95, "sequential_3step_stateful": 5.07, "conditional_routing_stateful": 42.62, "sequential_reasoning_stateful": 87.37, "error_recovery_stateful": 1.39, "data_gap_recovery_stateful": 36.72, "data_gap_recovery_extended_stateful": 33.12, "argument_transformation_stateful": 57.68, "grounded_synthesis_stateful": 263.53, 
"inconsistent_api_recovery_stateful": 254.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 10, "conditional_routing": 27, "sequential_reasoning": 42, "error_recovery": 12, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 36, "grounded_synthesis": 49, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 10, "conditional_routing_stateful": 29, "sequential_reasoning_stateful": 41, "error_recovery_stateful": 5, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 29, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 45}}, {"label": "Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [bare:full]", "model": "Nemotron-3-Nano-30B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "nemotron-3-nano", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 37.5, "accuracy": 85.7, "completeness": 43.7, "efficiency": 94.5, "wasted": 0.2, "speed": 6.6, "n": 50, "scenarios": {"relevance_detection": 32, "argument_fidelity": 100, "tool_selection": 88, "basic_2step": 74, "sequential_3step": 86, "conditional_routing": 70, "sequential_reasoning": 6, "error_recovery": 0, "data_gap_recovery": 20, "data_gap_recovery_extended": 10, "argument_transformation": 0, "grounded_synthesis": 14, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 14, "argument_fidelity_stateful": 100, "tool_selection_stateful": 94, "basic_2step_stateful": 72, "sequential_3step_stateful": 92, "conditional_routing_stateful": 56, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 18, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 35, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 5, "argument_transformation": 0, "grounded_synthesis": 7, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 36, "sequential_3step_stateful": 46, "conditional_routing_stateful": 28, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, 
"argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}, "scenarioValidated": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}, "scenarioIdealCalls": {"relevance_detection": 16, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 74, "sequential_3step": 129, "conditional_routing": 140, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 40, "argument_transformation": 0, "grounded_synthesis": 70, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 72, "sequential_3step_stateful": 138, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 16, "argument_fidelity": 150, "tool_selection": 132, "basic_2step": 74, "sequential_3step": 129, "conditional_routing": 154, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 52, "data_gap_recovery_extended": 37, "argument_transformation": 0, "grounded_synthesis": 100, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 150, "tool_selection_stateful": 141, "basic_2step_stateful": 72, "sequential_3step_stateful": 138, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 51, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 74, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 19.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 44.0, "inconsistent_api_recovery": 2.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 44.0, "inconsistent_api_recovery_stateful": 1.0}, "scenarioWastedN": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, 
"sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, "argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}, "scenarioSpeedSum": {"relevance_detection": 24.23, "argument_fidelity": 189.73, "tool_selection": 145.03, "basic_2step": 73.72, "sequential_3step": 174.26, "conditional_routing": 429.97, "sequential_reasoning": 39.61, "error_recovery": 0.0, "data_gap_recovery": 139.28, "data_gap_recovery_extended": 180.47, "argument_transformation": 0.0, "grounded_synthesis": 424.52, "inconsistent_api_recovery": 49.47, "relevance_detection_stateful": 9.56, "argument_fidelity_stateful": 177.04, "tool_selection_stateful": 144.92, "basic_2step_stateful": 85.47, "sequential_3step_stateful": 175.16, "conditional_routing_stateful": 419.76, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 107.89, "data_gap_recovery_extended_stateful": 186.55, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 563.2, "inconsistent_api_recovery_stateful": 29.92}, "scenarioSpeedN": {"relevance_detection": 16, "argument_fidelity": 50, "tool_selection": 44, "basic_2step": 37, "sequential_3step": 43, "conditional_routing": 47, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 22, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 7, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 47, "basic_2step_stateful": 37, "sequential_3step_stateful": 46, "conditional_routing_stateful": 47, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 9, "data_gap_recovery_extended_stateful": 12, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 25, "inconsistent_api_recovery_stateful": 1}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q4_K_M LF/P [bare:full]", "model": "Meta-Llama-3.1-8B-Instruct.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 36.1, "accuracy": 42.6, "completeness": 84.6, "efficiency": 98.6, "wasted": 0.4, "speed": 2.2, "n": 50, "scenarios": {"relevance_detection": 92, "argument_fidelity": 98, "tool_selection": 60, "basic_2step": 100, "sequential_3step": 42, "conditional_routing": 62, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 90, "argument_fidelity_stateful": 96, "tool_selection_stateful": 60, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 62, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 46, "argument_fidelity": 49, "tool_selection": 30, "basic_2step": 50, "sequential_3step": 21, "conditional_routing": 31, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 9, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 48, "tool_selection_stateful": 30, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 31, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 46, "argument_fidelity": 147, "tool_selection": 90, "basic_2step": 100, "sequential_3step": 63, "conditional_routing": 124, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 144, "tool_selection_stateful": 90, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 124, "sequential_reasoning_stateful": 16, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 46, "argument_fidelity": 148, "tool_selection": 79, "basic_2step": 100, "sequential_3step": 63, "conditional_routing": 131, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 34, 
"data_gap_recovery_extended": 22, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 146, "tool_selection_stateful": 90, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 137, "sequential_reasoning_stateful": 19, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 26, "data_gap_recovery_extended_stateful": 33, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 26}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 1.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 22.0, "sequential_reasoning": 16.0, "error_recovery": 0.0, "data_gap_recovery": 23.0, "data_gap_recovery_extended": 19.0, "argument_transformation": 31.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 18.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 2.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 85.0, "conditional_routing_stateful": 28.0, "sequential_reasoning_stateful": 72.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 29.0, "data_gap_recovery_extended_stateful": 27.0, "argument_transformation_stateful": 69.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 23.0}, "scenarioWastedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, 
"conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 19.36, "argument_fidelity": 63.56, "tool_selection": 53.96, "basic_2step": 38.56, "sequential_3step": 50.24, "conditional_routing": 109.39, "sequential_reasoning": 84.99, "error_recovery": 0.0, "data_gap_recovery": 109.28, "data_gap_recovery_extended": 146.13, "argument_transformation": 71.24, "grounded_synthesis": 243.69, "inconsistent_api_recovery": 166.1, "relevance_detection_stateful": 18.7, "argument_fidelity_stateful": 62.5, "tool_selection_stateful": 58.52, "basic_2step_stateful": 44.97, "sequential_3step_stateful": 73.02, "conditional_routing_stateful": 106.4, "sequential_reasoning_stateful": 107.61, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 102.97, "data_gap_recovery_extended_stateful": 154.78, "argument_transformation_stateful": 130.27, "grounded_synthesis_stateful": 243.07, "inconsistent_api_recovery_stateful": 166.6}, "scenarioSpeedN": {"relevance_detection": 46, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 47, "argument_transformation": 11, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 45, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 47, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 12, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-tiny-q4_K_M OL/N [reforged:full]", "model": "granite-4.0:h-tiny-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 35.5, "accuracy": 50.9, "completeness": 69.8, "efficiency": 78.9, "wasted": 1.4, "speed": 4.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 4, "basic_2step": 100, "sequential_3step": 100, "conditional_routing": 6, "sequential_reasoning": 100, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 6, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 100, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 
50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 3, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 
50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 6, "basic_2step": 100, "sequential_3step": 150, "conditional_routing": 12, "sequential_reasoning": 200, "error_recovery": 100, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 9, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 150, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 9, "basic_2step": 200, "sequential_3step": 150, "conditional_routing": 18, "sequential_reasoning": 250, "error_recovery": 200, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 13, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 25, "sequential_reasoning_stateful": 250, "error_recovery_stateful": 200, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 3.0, "basic_2step": 100.0, "sequential_3step": 0.0, "conditional_routing": 6.0, "sequential_reasoning": 50.0, "error_recovery": 100.0, "data_gap_recovery": 47.0, "data_gap_recovery_extended": 4.0, "argument_transformation": 0.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 217.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 4.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 9.0, "sequential_reasoning_stateful": 50.0, "error_recovery_stateful": 50.0, "data_gap_recovery_stateful": 51.0, "data_gap_recovery_extended_stateful": 10.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 295.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 86.85, "tool_selection": 6.59, "basic_2step": 108.12, "sequential_3step": 80.18, 
"conditional_routing": 165.18, "sequential_reasoning": 156.15, "error_recovery": 115.15, "data_gap_recovery": 144.2, "data_gap_recovery_extended": 68.78, "argument_transformation": 133.34, "grounded_synthesis": 345.22, "inconsistent_api_recovery": 389.7, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 86.69, "tool_selection_stateful": 9.64, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 80.8, "conditional_routing_stateful": 177.47, "sequential_reasoning_stateful": 155.68, "error_recovery_stateful": 116.07, "data_gap_recovery_stateful": 150.95, "data_gap_recovery_extended_stateful": 88.53, "argument_transformation_stateful": 129.85, "grounded_synthesis_stateful": 345.05, "inconsistent_api_recovery_stateful": 491.27}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 2, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 18, "data_gap_recovery_extended": 8, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 3, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 10, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "mistral-nemo:12b-instruct-2407-q4_K_M OL/N [reforged:full]", "model": "mistral-nemo:12b-instruct-2407-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 34.2, "accuracy": 44.9, "completeness": 76.0, "efficiency": 48.7, "wasted": 3.4, "speed": 7.9, "n": 50, "scenarios": {"relevance_detection": 
46, "argument_fidelity": 14, "tool_selection": 48, "basic_2step": 98, "sequential_3step": 28, "conditional_routing": 34, "sequential_reasoning": 50, "error_recovery": 44, "data_gap_recovery": 82, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 22, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 8, "tool_selection_stateful": 56, "basic_2step_stateful": 100, "sequential_3step_stateful": 28, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 64, "error_recovery_stateful": 34, "data_gap_recovery_stateful": 78, "data_gap_recovery_extended_stateful": 6, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 23, "argument_fidelity": 7, "tool_selection": 24, "basic_2step": 49, "sequential_3step": 14, "conditional_routing": 17, "sequential_reasoning": 25, "error_recovery": 22, "data_gap_recovery": 41, "data_gap_recovery_extended": 6, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 11, "relevance_detection_stateful": 1, 
"argument_fidelity_stateful": 4, "tool_selection_stateful": 28, "basic_2step_stateful": 50, "sequential_3step_stateful": 14, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 32, "error_recovery_stateful": 17, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 3, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}, "scenarioValidated": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, 
"grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}, "scenarioIdealCalls": {"relevance_detection": 23, "argument_fidelity": 21, "tool_selection": 72, "basic_2step": 98, "sequential_3step": 42, "conditional_routing": 68, "sequential_reasoning": 100, "error_recovery": 44, "data_gap_recovery": 205, "data_gap_recovery_extended": 48, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 88, "relevance_detection_stateful": 1, "argument_fidelity_stateful": 12, "tool_selection_stateful": 84, "basic_2step_stateful": 100, "sequential_3step_stateful": 42, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 128, "error_recovery_stateful": 51, "data_gap_recovery_stateful": 195, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 100, "argument_fidelity": 47, "tool_selection": 162, "basic_2step": 221, "sequential_3step": 103, "conditional_routing": 184, "sequential_reasoning": 194, "error_recovery": 113, "data_gap_recovery": 349, "data_gap_recovery_extended": 89, "argument_transformation": 0, "grounded_synthesis": 19, "inconsistent_api_recovery": 167, "relevance_detection_stateful": 2, "argument_fidelity_stateful": 26, "tool_selection_stateful": 198, "basic_2step_stateful": 172, "sequential_3step_stateful": 107, "conditional_routing_stateful": 189, "sequential_reasoning_stateful": 233, "error_recovery_stateful": 87, "data_gap_recovery_stateful": 321, "data_gap_recovery_extended_stateful": 37, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 77.0, "argument_fidelity": 151.0, "tool_selection": 109.0, "basic_2step": 123.0, "sequential_3step": 154.0, "conditional_routing": 140.0, "sequential_reasoning": 140.0, "error_recovery": 160.0, 
"data_gap_recovery": 166.0, "data_gap_recovery_extended": 192.0, "argument_transformation": 28.0, "grounded_synthesis": 57.0, "inconsistent_api_recovery": 282.0, "relevance_detection_stateful": 87.0, "argument_fidelity_stateful": 151.0, "tool_selection_stateful": 142.0, "basic_2step_stateful": 72.0, "sequential_3step_stateful": 151.0, "conditional_routing_stateful": 164.0, "sequential_reasoning_stateful": 132.0, "error_recovery_stateful": 119.0, "data_gap_recovery_stateful": 141.0, "data_gap_recovery_extended_stateful": 160.0, "argument_transformation_stateful": 50.0, "grounded_synthesis_stateful": 41.0, "inconsistent_api_recovery_stateful": 217.0}, "scenarioWastedN": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}, "scenarioSpeedSum": {"relevance_detection": 56.93, "argument_fidelity": 117.8, "tool_selection": 182.16, "basic_2step": 93.28, "sequential_3step": 320.16, "conditional_routing": 314.28, "sequential_reasoning": 242.33, "error_recovery": 239.99, "data_gap_recovery": 433.57, "data_gap_recovery_extended": 582.79, "argument_transformation": 218.44, "grounded_synthesis": 668.38, "inconsistent_api_recovery": 481.14, "relevance_detection_stateful": 70.58, "argument_fidelity_stateful": 126.07, "tool_selection_stateful": 
216.67, "basic_2step_stateful": 81.48, "sequential_3step_stateful": 276.19, "conditional_routing_stateful": 317.28, "sequential_reasoning_stateful": 213.79, "error_recovery_stateful": 287.49, "data_gap_recovery_stateful": 402.25, "data_gap_recovery_extended_stateful": 483.4, "argument_transformation_stateful": 252.0, "grounded_synthesis_stateful": 679.32, "inconsistent_api_recovery_stateful": 424.44}, "scenarioSpeedN": {"relevance_detection": 23, "argument_fidelity": 27, "tool_selection": 31, "basic_2step": 49, "sequential_3step": 37, "conditional_routing": 21, "sequential_reasoning": 39, "error_recovery": 50, "data_gap_recovery": 48, "data_gap_recovery_extended": 46, "argument_transformation": 32, "grounded_synthesis": 40, "inconsistent_api_recovery": 45, "relevance_detection_stateful": 29, "argument_fidelity_stateful": 28, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 38, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 40, "error_recovery_stateful": 49, "data_gap_recovery_stateful": 47, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 44, "inconsistent_api_recovery_stateful": 42}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q4_K_M LS/P [bare:full]", "model": "Meta-Llama-3.1-8B-Instruct-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 33.0, "accuracy": 38.5, "completeness": 85.7, "efficiency": 100.0, "wasted": 0.2, "speed": 1.3, "n": 50, "scenarios": {"relevance_detection": 68, "argument_fidelity": 80, "tool_selection": 82, "basic_2step": 100, "sequential_3step": 22, "conditional_routing": 50, "sequential_reasoning": 8, "error_recovery": 0, "data_gap_recovery": 24, "data_gap_recovery_extended": 20, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 64, 
"argument_fidelity_stateful": 84, "tool_selection_stateful": 80, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 16, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 34, "argument_fidelity": 40, "tool_selection": 41, "basic_2step": 50, "sequential_3step": 11, "conditional_routing": 25, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 10, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 42, "tool_selection_stateful": 40, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 34, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 34, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 34, "argument_fidelity": 120, "tool_selection": 123, "basic_2step": 100, "sequential_3step": 33, "conditional_routing": 100, "sequential_reasoning": 16, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 
80, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 126, "tool_selection_stateful": 120, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 96, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 40, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 34, "argument_fidelity": 120, "tool_selection": 96, "basic_2step": 100, "sequential_3step": 33, "conditional_routing": 119, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 39, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 3, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 126, "tool_selection_stateful": 125, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 120, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 38, "data_gap_recovery_extended_stateful": 24, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 24.0, "sequential_reasoning": 20.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 5.0, "argument_transformation": 12.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 5.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 3.0, "conditional_routing_stateful": 25.0, "sequential_reasoning_stateful": 55.0, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 20.0, "data_gap_recovery_extended_stateful": 6.0, "argument_transformation_stateful": 20.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 34, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 6.38, "argument_fidelity": 32.01, "tool_selection": 27.74, "basic_2step": 18.04, "sequential_3step": 20.88, "conditional_routing": 51.23, "sequential_reasoning": 42.64, "error_recovery": 0.0, "data_gap_recovery": 49.28, "data_gap_recovery_extended": 84.08, "argument_transformation": 167.19, "grounded_synthesis": 145.91, "inconsistent_api_recovery": 78.02, "relevance_detection_stateful": 5.82, "argument_fidelity_stateful": 29.06, "tool_selection_stateful": 33.82, "basic_2step_stateful": 20.15, "sequential_3step_stateful": 18.59, "conditional_routing_stateful": 46.63, "sequential_reasoning_stateful": 53.54, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 69.39, "data_gap_recovery_extended_stateful": 82.25, "argument_transformation_stateful": 115.27, "grounded_synthesis_stateful": 143.85, "inconsistent_api_recovery_stateful": 80.45}, "scenarioSpeedN": {"relevance_detection": 
34, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 44, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 32, "argument_fidelity_stateful": 50, "tool_selection_stateful": 49, "basic_2step_stateful": 50, "sequential_3step_stateful": 46, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 46, "argument_transformation_stateful": 34, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q8_0 LS/N [bare]", "model": "Ministral-3-8B-Instruct-2512-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q8_0", "gen": 3, "retired": false, "score": 33.0, "accuracy": 62.2, "completeness": 53.1, "efficiency": 100.0, "wasted": 0.0, "speed": 4.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 68, "data_gap_recovery_extended": 0, "argument_transformation": 2, "grounded_synthesis": 2, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 78, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 
50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 1, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 8, "argument_transformation": 47, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 13, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 8, "argument_transformation": 47, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 13, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 170, "data_gap_recovery_extended": 0, "argument_transformation": 5, "grounded_synthesis": 10, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 195, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 10, 
"grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 136, "data_gap_recovery_extended": 0, "argument_transformation": 3, "grounded_synthesis": 5, "inconsistent_api_recovery": 250, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 156, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 34, 
"data_gap_recovery_extended": 8, "argument_transformation": 47, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 13, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 59.75, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 37.53, "conditional_routing": 142.05, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 183.14, "data_gap_recovery_extended": 42.64, "argument_transformation": 519.07, "grounded_synthesis": 298.01, "inconsistent_api_recovery": 189.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 59.74, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 37.53, "conditional_routing_stateful": 167.98, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 211.63, "data_gap_recovery_extended_stateful": 76.51, "argument_transformation_stateful": 545.73, "grounded_synthesis_stateful": 302.34, "inconsistent_api_recovery_stateful": 188.49}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 34, "data_gap_recovery_extended": 8, "argument_transformation": 47, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 39, "data_gap_recovery_extended_stateful": 13, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct-Q8_0 LS/P [bare:full]", "model": "Meta-Llama-3.1-8B-Instruct-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 31.5, "accuracy": 36.3, "completeness": 86.8, "efficiency": 100.0, "wasted": 0.2, "speed": 2.1, "n": 50, "scenarios": {"relevance_detection": 78, "argument_fidelity": 94, "tool_selection": 72, "basic_2step": 100, "sequential_3step": 6, "conditional_routing": 22, "sequential_reasoning": 2, "error_recovery": 0, "data_gap_recovery": 24, "data_gap_recovery_extended": 12, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 92, "argument_fidelity_stateful": 86, "tool_selection_stateful": 66, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 26, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 39, "argument_fidelity": 47, "tool_selection": 36, "basic_2step": 50, "sequential_3step": 3, "conditional_routing": 11, "sequential_reasoning": 1, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 6, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 43, "tool_selection_stateful": 33, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 39, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 39, 
"argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 39, "argument_fidelity": 141, "tool_selection": 108, "basic_2step": 100, "sequential_3step": 9, "conditional_routing": 44, "sequential_reasoning": 4, "error_recovery": 0, "data_gap_recovery": 60, "data_gap_recovery_extended": 48, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 129, "tool_selection_stateful": 99, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 52, "sequential_reasoning_stateful": 8, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 55, "data_gap_recovery_extended_stateful": 56, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 39, "argument_fidelity": 141, "tool_selection": 75, "basic_2step": 93, "sequential_3step": 8, "conditional_routing": 56, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 38, "data_gap_recovery_extended": 30, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 46, 
"argument_fidelity_stateful": 130, "tool_selection_stateful": 103, "basic_2step_stateful": 92, "sequential_3step_stateful": 0, "conditional_routing_stateful": 65, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 41, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 12.0, "sequential_reasoning": 17.0, "error_recovery": 0.0, "data_gap_recovery": 1.0, "data_gap_recovery_extended": 8.0, "argument_transformation": 27.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.0, "tool_selection_stateful": 5.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 10.0, "conditional_routing_stateful": 13.0, "sequential_reasoning_stateful": 39.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 20.0, "data_gap_recovery_extended_stateful": 9.0, "argument_transformation_stateful": 37.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 39, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, 
"argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 10.81, "argument_fidelity": 68.01, "tool_selection": 48.61, "basic_2step": 27.89, "sequential_3step": 30.97, "conditional_routing": 90.13, "sequential_reasoning": 93.01, "error_recovery": 0.0, "data_gap_recovery": 84.15, "data_gap_recovery_extended": 118.23, "argument_transformation": 239.6, "grounded_synthesis": 208.57, "inconsistent_api_recovery": 123.12, "relevance_detection_stateful": 12.87, "argument_fidelity_stateful": 51.28, "tool_selection_stateful": 58.2, "basic_2step_stateful": 29.58, "sequential_3step_stateful": 33.61, "conditional_routing_stateful": 66.16, "sequential_reasoning_stateful": 81.45, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 104.99, "data_gap_recovery_extended_stateful": 170.91, "argument_transformation_stateful": 295.76, "grounded_synthesis_stateful": 209.04, "inconsistent_api_recovery_stateful": 129.5}, "scenarioSpeedN": {"relevance_detection": 39, "argument_fidelity": 50, "tool_selection": 49, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 41, "data_gap_recovery_extended": 44, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 46, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 48, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 43, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 32, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:14b-instruct-2512-q4_K_M OL/N [bare:full]", "model": "ministral-3:14b-instruct-2512-q4_K_M", "backend": "ollama", "mode": "native", 
"ablation": "bare", "replay": "full", "family": "ministral-14b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 32.0, "accuracy": 84.4, "completeness": 37.9, "efficiency": 100.0, "wasted": 0.1, "speed": 3.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 100, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 30, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 100, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, 
"conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 15, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 
50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 120, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 150, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 150, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 5, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, 
"argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 28.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 40.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}, "scenarioSpeedSum": {"relevance_detection": 22.68, "argument_fidelity": 72.87, "tool_selection": 45.81, "basic_2step": 0.0, "sequential_3step": 92.07, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 219.68, "grounded_synthesis": 90.02, "inconsistent_api_recovery": 
137.73, "relevance_detection_stateful": 22.95, "argument_fidelity_stateful": 72.74, "tool_selection_stateful": 45.7, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 117.96, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 4.92, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 306.66, "grounded_synthesis_stateful": 272.89, "inconsistent_api_recovery_stateful": 119.49}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 15, "grounded_synthesis": 6, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 16}}, {"label": "Qwen3-14B-Q4_K_M LS/N [bare:keep-last]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "keep-last", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 30.8, "accuracy": 52.1, "completeness": 59.2, "efficiency": 100.0, "wasted": 0.0, "speed": 18.3, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 28, "tool_selection": 6, "basic_2step": 24, "sequential_3step": 60, "conditional_routing": 72, "sequential_reasoning": 34, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 16, "argument_transformation": 4, "grounded_synthesis": 34, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 12, "tool_selection_stateful": 8, "basic_2step_stateful": 46, "sequential_3step_stateful": 52, "conditional_routing_stateful": 60, "sequential_reasoning_stateful": 46, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 24, "data_gap_recovery_extended_stateful": 28, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 14, "tool_selection": 3, "basic_2step": 12, "sequential_3step": 30, "conditional_routing": 36, "sequential_reasoning": 17, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 8, "argument_transformation": 2, "grounded_synthesis": 17, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 4, "basic_2step_stateful": 23, "sequential_3step_stateful": 26, "conditional_routing_stateful": 30, "sequential_reasoning_stateful": 23, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 12, 
"data_gap_recovery_extended_stateful": 14, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 15, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 14, "tool_selection": 7, "basic_2step": 12, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 17, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 40, "argument_transformation": 32, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 7, "basic_2step_stateful": 24, "sequential_3step_stateful": 47, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 23, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 14, "tool_selection": 7, "basic_2step": 12, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 17, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 40, "argument_transformation": 32, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 7, "basic_2step_stateful": 24, "sequential_3step_stateful": 47, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 23, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 42, "tool_selection": 9, "basic_2step": 24, "sequential_3step": 90, "conditional_routing": 144, "sequential_reasoning": 68, 
"error_recovery": 0, "data_gap_recovery": 30, "data_gap_recovery_extended": 64, "argument_transformation": 10, "grounded_synthesis": 170, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 18, "tool_selection_stateful": 12, "basic_2step_stateful": 46, "sequential_3step_stateful": 78, "conditional_routing_stateful": 120, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 60, "data_gap_recovery_extended_stateful": 112, "argument_transformation_stateful": 15, "grounded_synthesis_stateful": 150, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 42, "tool_selection": 9, "basic_2step": 15, "sequential_3step": 90, "conditional_routing": 126, "sequential_reasoning": 68, "error_recovery": 0, "data_gap_recovery": 24, "data_gap_recovery_extended": 27, "argument_transformation": 10, "grounded_synthesis": 65, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 18, "tool_selection_stateful": 12, "basic_2step_stateful": 40, "sequential_3step_stateful": 78, "conditional_routing_stateful": 90, "sequential_reasoning_stateful": 92, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 46, "data_gap_recovery_extended_stateful": 54, "argument_transformation_stateful": 14, "grounded_synthesis_stateful": 70, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 13.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 
0.0, "conditional_routing_stateful": 9.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 1.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 14, "tool_selection": 7, "basic_2step": 12, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 17, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 40, "argument_transformation": 32, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 7, "basic_2step_stateful": 24, "sequential_3step_stateful": 47, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 23, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 97.97, "argument_fidelity": 89.67, "tool_selection": 31.21, "basic_2step": 30.51, "sequential_3step": 388.95, "conditional_routing": 701.11, "sequential_reasoning": 298.49, "error_recovery": 0.0, "data_gap_recovery": 413.5, "data_gap_recovery_extended": 803.01, "argument_transformation": 1789.96, "grounded_synthesis": 1530.57, "inconsistent_api_recovery": 744.37, "relevance_detection_stateful": 95.54, "argument_fidelity_stateful": 38.27, "tool_selection_stateful": 32.85, "basic_2step_stateful": 73.44, "sequential_3step_stateful": 361.84, "conditional_routing_stateful": 674.6, "sequential_reasoning_stateful": 403.77, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 517.17, "data_gap_recovery_extended_stateful": 789.57, "argument_transformation_stateful": 1609.16, "grounded_synthesis_stateful": 
1727.42, "inconsistent_api_recovery_stateful": 821.88}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 14, "tool_selection": 7, "basic_2step": 12, "sequential_3step": 47, "conditional_routing": 38, "sequential_reasoning": 17, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 40, "argument_transformation": 32, "grounded_synthesis": 43, "inconsistent_api_recovery": 49, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 6, "tool_selection_stateful": 7, "basic_2step_stateful": 24, "sequential_3step_stateful": 47, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 23, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 34, "data_gap_recovery_extended_stateful": 39, "argument_transformation_stateful": 30, "grounded_synthesis_stateful": 42, "inconsistent_api_recovery_stateful": 50}}, {"label": "Meta-Llama-3.1-8B-Instruct.Q8_0 LF/P [bare:full]", "model": "Meta-Llama-3.1-8B-Instruct.Q8_0", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 29.8, "accuracy": 34.6, "completeness": 86.3, "efficiency": 100.0, "wasted": 0.2, "speed": 3.1, "n": 50, "scenarios": {"relevance_detection": 98, "argument_fidelity": 82, "tool_selection": 22, "basic_2step": 100, "sequential_3step": 12, "conditional_routing": 48, "sequential_reasoning": 6, "error_recovery": 0, "data_gap_recovery": 28, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 72, "tool_selection_stateful": 14, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 30, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 49, "argument_fidelity": 41, "tool_selection": 11, "basic_2step": 50, "sequential_3step": 6, "conditional_routing": 24, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 36, "tool_selection_stateful": 7, "basic_2step_stateful": 50, "sequential_3step_stateful": 0, "conditional_routing_stateful": 23, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 15, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, 
"argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, "argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 49, "argument_fidelity": 123, "tool_selection": 33, "basic_2step": 100, "sequential_3step": 18, "conditional_routing": 96, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 70, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 108, "tool_selection_stateful": 21, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 92, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 
0, "data_gap_recovery_stateful": 75, "data_gap_recovery_extended_stateful": 32, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 49, "argument_fidelity": 123, "tool_selection": 30, "basic_2step": 100, "sequential_3step": 18, "conditional_routing": 114, "sequential_reasoning": 8, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 34, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 109, "tool_selection_stateful": 21, "basic_2step_stateful": 100, "sequential_3step_stateful": 0, "conditional_routing_stateful": 112, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 54, "data_gap_recovery_extended_stateful": 26, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 20.0, "sequential_reasoning": 4.0, "error_recovery": 0.0, "data_gap_recovery": 2.0, "data_gap_recovery_extended": 9.0, "argument_transformation": 88.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 1.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 6.0, "conditional_routing_stateful": 22.0, "sequential_reasoning_stateful": 39.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 1.0, "data_gap_recovery_extended_stateful": 9.0, "argument_transformation_stateful": 60.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, 
"sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, "argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 25.18, "argument_fidelity": 82.94, "tool_selection": 59.55, "basic_2step": 50.54, "sequential_3step": 56.69, "conditional_routing": 122.34, "sequential_reasoning": 109.92, "error_recovery": 0.0, "data_gap_recovery": 140.71, "data_gap_recovery_extended": 206.29, "argument_transformation": 358.24, "grounded_synthesis": 349.71, "inconsistent_api_recovery": 199.46, "relevance_detection_stateful": 26.05, "argument_fidelity_stateful": 85.06, "tool_selection_stateful": 60.87, "basic_2step_stateful": 56.08, "sequential_3step_stateful": 56.3, "conditional_routing_stateful": 125.12, "sequential_reasoning_stateful": 140.72, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 138.89, "data_gap_recovery_extended_stateful": 204.91, "argument_transformation_stateful": 242.59, "grounded_synthesis_stateful": 337.19, "inconsistent_api_recovery_stateful": 199.35}, "scenarioSpeedN": {"relevance_detection": 49, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 48, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 45, "data_gap_recovery_extended": 47, "argument_transformation": 23, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 48, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 45, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 20, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3-14B-Q4_K_M LS/N [bare:full]", "model": "Qwen3-14B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 27.5, "accuracy": 48.5, "completeness": 56.8, "efficiency": 100.0, "wasted": 0.0, "speed": 16.5, "n": 50, "scenarios": {"relevance_detection": 96, "argument_fidelity": 16, "tool_selection": 4, "basic_2step": 24, "sequential_3step": 48, "conditional_routing": 62, "sequential_reasoning": 36, "error_recovery": 0, "data_gap_recovery": 22, "data_gap_recovery_extended": 4, "argument_transformation": 6, "grounded_synthesis": 28, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 18, "tool_selection_stateful": 6, "basic_2step_stateful": 52, "sequential_3step_stateful": 62, "conditional_routing_stateful": 60, "sequential_reasoning_stateful": 18, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 14, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 4, "grounded_synthesis_stateful": 32, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, 
"relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 48, "argument_fidelity": 8, "tool_selection": 2, "basic_2step": 12, "sequential_3step": 24, "conditional_routing": 31, "sequential_reasoning": 18, "error_recovery": 0, "data_gap_recovery": 11, "data_gap_recovery_extended": 2, "argument_transformation": 3, "grounded_synthesis": 14, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 9, "tool_selection_stateful": 3, "basic_2step_stateful": 26, "sequential_3step_stateful": 31, "conditional_routing_stateful": 30, "sequential_reasoning_stateful": 9, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 7, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 16, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 48, "argument_fidelity": 8, "tool_selection": 5, "basic_2step": 12, "sequential_3step": 42, "conditional_routing": 36, "sequential_reasoning": 19, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 41, "argument_transformation": 29, "grounded_synthesis": 44, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 9, "tool_selection_stateful": 6, "basic_2step_stateful": 28, "sequential_3step_stateful": 46, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 9, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 23, "data_gap_recovery_extended_stateful": 36, 
"argument_transformation_stateful": 26, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 49}, "scenarioValidated": {"relevance_detection": 48, "argument_fidelity": 8, "tool_selection": 5, "basic_2step": 12, "sequential_3step": 42, "conditional_routing": 36, "sequential_reasoning": 19, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 41, "argument_transformation": 29, "grounded_synthesis": 44, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 9, "tool_selection_stateful": 6, "basic_2step_stateful": 28, "sequential_3step_stateful": 46, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 9, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 23, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 26, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 49}, "scenarioIdealCalls": {"relevance_detection": 48, "argument_fidelity": 24, "tool_selection": 6, "basic_2step": 24, "sequential_3step": 72, "conditional_routing": 124, "sequential_reasoning": 72, "error_recovery": 0, "data_gap_recovery": 55, "data_gap_recovery_extended": 16, "argument_transformation": 15, "grounded_synthesis": 140, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 27, "tool_selection_stateful": 9, "basic_2step_stateful": 52, "sequential_3step_stateful": 93, "conditional_routing_stateful": 120, "sequential_reasoning_stateful": 36, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 35, "data_gap_recovery_extended_stateful": 16, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 160, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 48, "argument_fidelity": 24, "tool_selection": 6, "basic_2step": 17, "sequential_3step": 72, "conditional_routing": 103, "sequential_reasoning": 72, "error_recovery": 0, "data_gap_recovery": 
39, "data_gap_recovery_extended": 6, "argument_transformation": 13, "grounded_synthesis": 50, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 27, "tool_selection_stateful": 9, "basic_2step_stateful": 43, "sequential_3step_stateful": 93, "conditional_routing_stateful": 103, "sequential_reasoning_stateful": 36, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 24, "data_gap_recovery_extended_stateful": 7, "argument_transformation_stateful": 10, "grounded_synthesis_stateful": 63, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 9.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 13.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 48, "argument_fidelity": 8, "tool_selection": 5, "basic_2step": 12, "sequential_3step": 42, "conditional_routing": 36, "sequential_reasoning": 19, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 41, "argument_transformation": 29, "grounded_synthesis": 44, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 9, "tool_selection_stateful": 6, "basic_2step_stateful": 28, "sequential_3step_stateful": 46, "conditional_routing_stateful": 41, 
"sequential_reasoning_stateful": 9, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 23, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 26, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 49}, "scenarioSpeedSum": {"relevance_detection": 85.12, "argument_fidelity": 51.07, "tool_selection": 21.58, "basic_2step": 31.56, "sequential_3step": 298.33, "conditional_routing": 604.32, "sequential_reasoning": 340.05, "error_recovery": 0.0, "data_gap_recovery": 485.13, "data_gap_recovery_extended": 749.57, "argument_transformation": 1401.16, "grounded_synthesis": 1501.74, "inconsistent_api_recovery": 712.55, "relevance_detection_stateful": 95.11, "argument_fidelity_stateful": 57.06, "tool_selection_stateful": 25.67, "basic_2step_stateful": 83.77, "sequential_3step_stateful": 343.55, "conditional_routing_stateful": 728.06, "sequential_reasoning_stateful": 153.09, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 317.25, "data_gap_recovery_extended_stateful": 654.54, "argument_transformation_stateful": 1135.79, "grounded_synthesis_stateful": 1607.12, "inconsistent_api_recovery_stateful": 676.43}, "scenarioSpeedN": {"relevance_detection": 48, "argument_fidelity": 8, "tool_selection": 5, "basic_2step": 12, "sequential_3step": 42, "conditional_routing": 36, "sequential_reasoning": 19, "error_recovery": 0, "data_gap_recovery": 36, "data_gap_recovery_extended": 41, "argument_transformation": 29, "grounded_synthesis": 44, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 9, "tool_selection_stateful": 6, "basic_2step_stateful": 28, "sequential_3step_stateful": 46, "conditional_routing_stateful": 41, "sequential_reasoning_stateful": 9, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 23, "data_gap_recovery_extended_stateful": 36, "argument_transformation_stateful": 26, "grounded_synthesis_stateful": 45, "inconsistent_api_recovery_stateful": 49}}, 
{"label": "Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [bare]", "model": "Ministral-3-14B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "ministral-14b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 27.6, "accuracy": 82.2, "completeness": 33.6, "efficiency": 100.0, "wasted": 0.0, "speed": 3.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 98, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, 
"inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 7, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 7, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 30, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 196, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 20, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 24, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 250, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 147, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 16, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 7, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 52.59, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 37.61, "conditional_routing": 147.0, 
"sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 18.88, "data_gap_recovery_extended": 0.0, "argument_transformation": 35.41, "grounded_synthesis": 115.06, "inconsistent_api_recovery": 189.77, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 52.88, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 37.62, "conditional_routing_stateful": 170.34, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 11.9, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 43.42, "grounded_synthesis_stateful": 188.47, "inconsistent_api_recovery_stateful": 189.41}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 4, "grounded_synthesis": 7, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 49, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 12, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/P [reforged:full]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "reforged", "replay": "full", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 26.9, "accuracy": 50.0, "completeness": 53.8, "efficiency": 81.6, "wasted": 0.4, "speed": 2.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, 
"sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 
0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": 
{"relevance_detection": 50, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 250, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 200, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 250, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 203, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 100.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 50.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, 
"relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 100.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 53.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 25.7, "argument_fidelity": 151.34, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 176.38, "conditional_routing": 136.79, "sequential_reasoning": 97.11, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 126.94, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 313.8, "relevance_detection_stateful": 23.99, "argument_fidelity_stateful": 147.54, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 188.05, "conditional_routing_stateful": 154.98, "sequential_reasoning_stateful": 72.57, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 
0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 125.42, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 310.44}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}}, {"label": "Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [bare]", "model": "Ministral-3-8B-Instruct-2512-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "none", "family": "ministral-8b", "quant": "q4_K_M", "gen": 3, "retired": false, "score": 27.3, "accuracy": 62.8, "completeness": 43.5, "efficiency": 100.0, "wasted": 0.0, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 100, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, 
"sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 400, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, 
"conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 150, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 5, "inconsistent_api_recovery": 250, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 150, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": 
{"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 41, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 37.11, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 64.86, "conditional_routing": 105.58, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 8.5, "data_gap_recovery_extended": 0.0, "argument_transformation": 376.28, "grounded_synthesis": 376.24, "inconsistent_api_recovery": 130.75, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 37.34, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 66.5, "conditional_routing_stateful": 108.39, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 8.38, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 396.82, "grounded_synthesis_stateful": 322.63, "inconsistent_api_recovery_stateful": 130.31}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 41, "grounded_synthesis": 41, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 45, "grounded_synthesis_stateful": 34, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-micro-q4_K_M OL/N [bare:full]", "model": "granite-4.0:h-micro-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 23.1, "accuracy": 50.0, "completeness": 46.2, "efficiency": 100.0, "wasted": 2.5, "speed": 5.3, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 100, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, 
"inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, 
"data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 200, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 350.0, "grounded_synthesis": 392.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 350.0, "grounded_synthesis_stateful": 392.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 97.1, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 80.12, "conditional_routing": 0.0, "sequential_reasoning": 120.21, "error_recovery": 0.0, "data_gap_recovery": 7.26, "data_gap_recovery_extended": 0.0, "argument_transformation": 419.95, "grounded_synthesis": 653.26, "inconsistent_api_recovery": 227.45, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 97.66, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 79.56, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 120.66, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 7.27, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 418.69, "grounded_synthesis_stateful": 653.94, "inconsistent_api_recovery_stateful": 226.84}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 49, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 1, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 49, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3.Q4_K_M 
LF/P [bare:full]", "model": "Mistral-7B-Instruct-v0.3.Q4_K_M", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 20.2, "accuracy": 22.6, "completeness": 89.7, "efficiency": 100.0, "wasted": 0.0, "speed": 2.9, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 36, "sequential_3step": 10, "conditional_routing": 10, "sequential_reasoning": 74, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 46, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 24, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 
50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 18, "sequential_3step": 5, "conditional_routing": 5, "sequential_reasoning": 37, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 23, "sequential_3step_stateful": 2, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 12, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 36, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 148, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 46, "sequential_3step_stateful": 6, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 48, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 21, "sequential_3step": 10, "conditional_routing": 10, "sequential_reasoning": 74, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 32, "sequential_3step_stateful": 4, "conditional_routing_stateful": 24, "sequential_reasoning_stateful": 68, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 26.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 30.79, "argument_fidelity": 55.41, "tool_selection": 57.1, "basic_2step": 40.02, "sequential_3step": 64.78, "conditional_routing": 122.38, "sequential_reasoning": 114.29, "error_recovery": 0.0, 
"data_gap_recovery": 133.9, "data_gap_recovery_extended": 190.95, "argument_transformation": 285.05, "grounded_synthesis": 317.07, "inconsistent_api_recovery": 228.95, "relevance_detection_stateful": 29.9, "argument_fidelity_stateful": 56.13, "tool_selection_stateful": 57.92, "basic_2step_stateful": 48.67, "sequential_3step_stateful": 58.34, "conditional_routing_stateful": 140.87, "sequential_reasoning_stateful": 121.56, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 123.49, "data_gap_recovery_extended_stateful": 189.64, "argument_transformation_stateful": 306.09, "grounded_synthesis_stateful": 324.61, "inconsistent_api_recovery_stateful": 225.5}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 50, "data_gap_recovery_extended": 48, "argument_transformation": 36, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 35, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3.Q8_0 LF/P [bare:full]", "model": "Mistral-7B-Instruct-v0.3.Q8_0", "backend": "llamafile", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 17.8, "accuracy": 19.6, "completeness": 91.0, "efficiency": 100.0, "wasted": 0.0, "speed": 4.2, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 16, "sequential_3step": 2, 
"conditional_routing": 30, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 10, "sequential_3step_stateful": 0, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 2, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 2, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 8, "sequential_3step": 1, "conditional_routing": 15, "sequential_reasoning": 33, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 5, 
"sequential_3step_stateful": 0, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": 
{"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 16, "sequential_3step": 3, "conditional_routing": 60, "sequential_reasoning": 132, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 10, "sequential_3step_stateful": 0, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 10, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 8, "sequential_3step": 2, "conditional_routing": 30, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 5, "sequential_3step_stateful": 0, "conditional_routing_stateful": 34, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, 
"relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 38.34, "argument_fidelity": 74.39, "tool_selection": 74.94, "basic_2step": 42.95, "sequential_3step": 72.07, "conditional_routing": 164.86, "sequential_reasoning": 151.28, "error_recovery": 0.0, "data_gap_recovery": 197.88, "data_gap_recovery_extended": 273.33, "argument_transformation": 645.87, "grounded_synthesis": 471.26, "inconsistent_api_recovery": 291.14, "relevance_detection_stateful": 39.81, "argument_fidelity_stateful": 74.38, "tool_selection_stateful": 76.57, "basic_2step_stateful": 41.32, "sequential_3step_stateful": 71.47, "conditional_routing_stateful": 169.31, "sequential_reasoning_stateful": 123.13, "error_recovery_stateful": 0.0, 
"data_gap_recovery_stateful": 222.38, "data_gap_recovery_extended_stateful": 289.34, "argument_transformation_stateful": 593.36, "grounded_synthesis_stateful": 435.84, "inconsistent_api_recovery_stateful": 292.39}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 49, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 45, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 49, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/P [bare:full]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 18.1, "accuracy": 20.1, "completeness": 90.0, "efficiency": 100.0, "wasted": 0.0, "speed": 1.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 30, "sequential_3step": 0, "conditional_routing": 4, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 56, "sequential_3step_stateful": 0, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 2, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 15, "sequential_3step": 0, "conditional_routing": 2, "sequential_reasoning": 33, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 1, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 0, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 30, "sequential_3step": 0, "conditional_routing": 8, "sequential_reasoning": 132, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 56, 
"sequential_3step_stateful": 0, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 15, "sequential_3step": 0, "conditional_routing": 10, "sequential_reasoning": 66, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 2, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 0, "conditional_routing_stateful": 7, "sequential_reasoning_stateful": 7, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 2.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 1.0, "sequential_reasoning_stateful": 3.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, 
"scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, "argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 13.37, "argument_fidelity": 26.41, "tool_selection": 26.71, "basic_2step": 16.38, "sequential_3step": 25.22, "conditional_routing": 59.84, "sequential_reasoning": 57.18, "error_recovery": 0.0, "data_gap_recovery": 67.48, "data_gap_recovery_extended": 96.99, "argument_transformation": 196.13, "grounded_synthesis": 179.88, "inconsistent_api_recovery": 121.95, "relevance_detection_stateful": 13.9, "argument_fidelity_stateful": 26.76, "tool_selection_stateful": 28.12, "basic_2step_stateful": 16.73, "sequential_3step_stateful": 24.96, "conditional_routing_stateful": 52.97, "sequential_reasoning_stateful": 43.99, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 64.86, "data_gap_recovery_extended_stateful": 77.11, "argument_transformation_stateful": 428.62, "grounded_synthesis_stateful": 172.74, "inconsistent_api_recovery_stateful": 126.46}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 50, 
"argument_transformation": 46, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 49, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 45, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 44, "argument_transformation_stateful": 37, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/P [bare:full]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 17.8, "accuracy": 19.6, "completeness": 90.5, "efficiency": 100.0, "wasted": 0.0, "speed": 2.4, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 2, "sequential_3step": 2, "conditional_routing": 40, "sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 8, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 6, "sequential_3step_stateful": 2, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, 
"argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 50, "basic_2step": 1, "sequential_3step": 1, "conditional_routing": 20, "sequential_reasoning": 31, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 4, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 3, "sequential_3step_stateful": 1, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 150, "basic_2step": 2, "sequential_3step": 3, "conditional_routing": 80, "sequential_reasoning": 124, "error_recovery": 0, "data_gap_recovery": 20, "data_gap_recovery_extended": 32, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 6, "sequential_3step_stateful": 3, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 100, "basic_2step": 1, "sequential_3step": 2, "conditional_routing": 40, 
"sequential_reasoning": 62, "error_recovery": 0, "data_gap_recovery": 8, "data_gap_recovery_extended": 9, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 3, "sequential_3step_stateful": 2, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 39, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 22.55, "argument_fidelity": 42.13, "tool_selection": 41.43, "basic_2step": 22.49, "sequential_3step": 55.55, "conditional_routing": 102.42, "sequential_reasoning": 88.78, "error_recovery": 0.0, "data_gap_recovery": 125.42, "data_gap_recovery_extended": 181.21, "argument_transformation": 383.24, "grounded_synthesis": 231.36, "inconsistent_api_recovery": 167.09, "relevance_detection_stateful": 22.17, "argument_fidelity_stateful": 42.66, "tool_selection_stateful": 42.21, "basic_2step_stateful": 23.24, "sequential_3step_stateful": 39.86, "conditional_routing_stateful": 94.82, "sequential_reasoning_stateful": 69.21, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 144.37, "data_gap_recovery_extended_stateful": 132.23, "argument_transformation_stateful": 338.66, "grounded_synthesis_stateful": 240.48, "inconsistent_api_recovery_stateful": 173.39}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 49, "data_gap_recovery_extended": 49, "argument_transformation": 43, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 47, "argument_transformation_stateful": 39, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:8b-instruct-2512-q8_0 OL/N [bare:full]", "model": "ministral-3:8b-instruct-2512-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "ministral-8b", "quant": "q8_0", "gen": 2, "retired": false, "score": 17.8, "accuracy": 49.6, "completeness": 36.0, "efficiency": 93.3, "wasted": 0.4, "speed": 6.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 68, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 14, "data_gap_recovery_extended": 0, "argument_transformation": 20, "grounded_synthesis": 0, "inconsistent_api_recovery": 96, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 64, "conditional_routing_stateful": 100, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, 
"grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 0, "argument_transformation": 10, "grounded_synthesis": 0, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, "grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, 
"grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 102, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 35, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 384, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 96, "conditional_routing_stateful": 200, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 5, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 102, "conditional_routing": 248, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 21, "data_gap_recovery_extended": 0, "argument_transformation": 92, "grounded_synthesis": 0, "inconsistent_api_recovery": 334, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 96, "conditional_routing_stateful": 250, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 48.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 48.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 8.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 50.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 34.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, "grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 32.28, "conditional_routing": 198.33, 
"sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 42.63, "data_gap_recovery_extended": 357.48, "argument_transformation": 182.43, "grounded_synthesis": 418.92, "inconsistent_api_recovery": 384.02, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 30.3, "conditional_routing_stateful": 194.38, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 314.74, "argument_transformation_stateful": 172.2, "grounded_synthesis_stateful": 459.71, "inconsistent_api_recovery_stateful": 386.94}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 34, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 7, "data_gap_recovery_extended": 44, "argument_transformation": 16, "grounded_synthesis": 39, "inconsistent_api_recovery": 48, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 32, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 38, "argument_transformation_stateful": 17, "grounded_synthesis_stateful": 43, "inconsistent_api_recovery_stateful": 50}}, {"label": "granite-4.0:h-tiny-q4_K_M OL/N [bare:full]", "model": "granite-4.0:h-tiny-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 15.5, "accuracy": 33.2, "completeness": 46.8, "efficiency": 100.0, "wasted": 1.0, "speed": 3.8, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, 
"sequential_3step": 100, "conditional_routing": 4, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 2, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, 
"sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, 
"argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 8, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 8, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 112.0, "relevance_detection_stateful": 0.0, 
"argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 188.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 87.34, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 81.1, "conditional_routing": 158.74, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 6.4, "data_gap_recovery_extended": 0.0, "argument_transformation": 130.11, "grounded_synthesis": 346.43, "inconsistent_api_recovery": 330.28, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 87.09, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 80.82, "conditional_routing_stateful": 158.96, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 12.84, "data_gap_recovery_extended_stateful": 
0.0, "argument_transformation_stateful": 127.4, "grounded_synthesis_stateful": 353.58, "inconsistent_api_recovery_stateful": 378.46}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 49, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-27B-Q4_K_M LS/N [bare:full]", "model": "Qwen3.5-27B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3.5-27b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 15.8, "accuracy": 100.0, "completeness": 15.8, "efficiency": 100.0, "wasted": 0.0, "speed": 11.0, "n": 50, "scenarios": {"relevance_detection": 88, "argument_fidelity": 0, "tool_selection": 12, "basic_2step": 30, "sequential_3step": 2, "conditional_routing": 32, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 16, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 96, "argument_fidelity_stateful": 0, "tool_selection_stateful": 24, "basic_2step_stateful": 56, "sequential_3step_stateful": 4, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 6, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, 
"data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 18, "basic_2step": 30, "sequential_3step": 3, "conditional_routing": 64, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 80, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 36, "basic_2step_stateful": 56, "sequential_3step_stateful": 6, "conditional_routing_stateful": 80, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, 
"data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 18, "basic_2step": 30, "sequential_3step": 3, "conditional_routing": 32, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 24, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 36, "basic_2step_stateful": 56, "sequential_3step_stateful": 6, "conditional_routing_stateful": 40, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 6, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 9, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, 
"conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, "tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 191.39, "argument_fidelity": 0.0, "tool_selection": 38.76, "basic_2step": 66.96, "sequential_3step": 8.19, "conditional_routing": 372.94, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 25.39, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 444.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 229.63, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 80.74, "basic_2step_stateful": 133.43, "sequential_3step_stateful": 17.0, "conditional_routing_stateful": 453.29, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 52.57, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 154.58, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 44, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 15, "sequential_3step": 1, "conditional_routing": 16, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 8, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 48, "argument_fidelity_stateful": 0, 
"tool_selection_stateful": 12, "basic_2step_stateful": 28, "sequential_3step_stateful": 2, "conditional_routing_stateful": 20, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 3, "inconsistent_api_recovery_stateful": 0}}, {"label": "granite-4.0-h-tiny-Q4_K_M LS/N [bare:full]", "model": "granite-4.0-h-tiny-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "granite-4.0-h-tiny", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 15.4, "accuracy": 33.3, "completeness": 46.2, "efficiency": 100.0, "wasted": 2.6, "speed": 3.5, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 100, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 100, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 100, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 100, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 
50}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 150, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 150, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 0, "argument_fidelity_stateful": 150, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 150, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 500.0, "grounded_synthesis": 150.0, "inconsistent_api_recovery": 147.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 500.0, "grounded_synthesis_stateful": 150.0, "inconsistent_api_recovery_stateful": 98.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 56.19, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 47.39, "conditional_routing": 140.43, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 276.63, "grounded_synthesis": 299.06, "inconsistent_api_recovery": 228.38, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 55.94, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 47.0, "conditional_routing_stateful": 139.78, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 280.87, "grounded_synthesis_stateful": 286.68, "inconsistent_api_recovery_stateful": 215.85}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}}, {"label": "ministral-3:8b-instruct-2512-q4_K_M OL/N [bare:full]", "model": "ministral-3:8b-instruct-2512-q4_K_M", "backend": "ollama", "mode": 
"native", "ablation": "bare", "replay": "full", "family": "ministral-8b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 14.2, "accuracy": 45.1, "completeness": 31.4, "efficiency": 90.9, "wasted": 0.2, "speed": 5.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 76, "conditional_routing": 100, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 86, "conditional_routing_stateful": 70, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, 
"conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 35, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 114, "conditional_routing": 200, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 32, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 129, "conditional_routing_stateful": 140, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 55, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 114, "conditional_routing": 244, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 6, "inconsistent_api_recovery": 26, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 129, "conditional_routing_stateful": 175, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 55, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, 
"tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 44.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 35.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 5.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 45.05, "conditional_routing": 276.25, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 20.46, "data_gap_recovery_extended": 0.0, "argument_transformation": 220.4, "grounded_synthesis": 370.1, "inconsistent_api_recovery": 28.25, 
"relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 49.9, "conditional_routing_stateful": 220.36, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 70.27, "data_gap_recovery_extended_stateful": 255.32, "argument_transformation_stateful": 221.32, "grounded_synthesis_stateful": 312.31, "inconsistent_api_recovery_stateful": 28.44}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 38, "conditional_routing": 50, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 43, "grounded_synthesis": 38, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 43, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 11, "data_gap_recovery_extended_stateful": 49, "argument_transformation_stateful": 42, "grounded_synthesis_stateful": 33, "inconsistent_api_recovery_stateful": 3}}, {"label": "granite-4.0-h-micro-Q4_K_M LS/P [bare:full]", "model": "granite-4.0-h-micro-Q4_K_M", "backend": "llamaserver", "mode": "prompt", "ablation": "bare", "replay": "full", "family": "granite-4.0-h-micro", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 11.5, "accuracy": 21.4, "completeness": 53.8, "efficiency": 100.0, "wasted": 0.0, "speed": 1.7, "n": 50, "scenarios": {"relevance_detection": 100, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 100, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioIdealCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 200, "error_recovery": 0, "data_gap_recovery": 0, 
"data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 50, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 100, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}, "scenarioSpeedSum": {"relevance_detection": 24.28, "argument_fidelity": 38.62, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 66.93, "conditional_routing": 134.76, "sequential_reasoning": 97.67, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 125.96, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 132.54, "relevance_detection_stateful": 23.99, "argument_fidelity_stateful": 38.58, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 66.92, "conditional_routing_stateful": 136.13, "sequential_reasoning_stateful": 72.65, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 125.73, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 132.17}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 50, 
"tool_selection": 0, "basic_2step": 0, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 50, "grounded_synthesis": 0, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 50}}, {"label": "Qwen3.5-35B-A3B-Q4_K_M LS/N [bare:full]", "model": "Qwen3.5-35B-A3B-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "qwen3.5-35b-a3b", "quant": "q4_K_M", "gen": 2, "retired": false, "score": 12.2, "accuracy": 97.5, "completeness": 12.5, "efficiency": 100.0, "wasted": 0.0, "speed": 3.9, "n": 50, "scenarios": {"relevance_detection": 76, "argument_fidelity": 2, "tool_selection": 0, "basic_2step": 28, "sequential_3step": 8, "conditional_routing": 26, "sequential_reasoning": 6, "error_recovery": 0, "data_gap_recovery": 6, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 2, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 70, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 8, "conditional_routing_stateful": 32, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 8, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, 
"basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, 
"basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 38, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 28, "sequential_3step": 12, "conditional_routing": 52, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 15, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 10, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 12, "conditional_routing_stateful": 64, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 8, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 40, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": 
{"relevance_detection": 38, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 28, "sequential_3step": 12, "conditional_routing": 40, "sequential_reasoning": 12, "error_recovery": 0, "data_gap_recovery": 10, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 28, "sequential_3step_stateful": 12, "conditional_routing_stateful": 46, "sequential_reasoning_stateful": 20, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 8, "data_gap_recovery_extended_stateful": 4, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 26, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 3.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 3.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, 
"relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 51.3, "argument_fidelity": 2.5, "tool_selection": 0.0, "basic_2step": 17.65, "sequential_3step": 10.61, "conditional_routing": 105.19, "sequential_reasoning": 10.71, "error_recovery": 0.0, "data_gap_recovery": 25.3, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 52.37, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 46.57, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 18.96, "sequential_3step_stateful": 10.04, "conditional_routing_stateful": 125.63, "sequential_reasoning_stateful": 17.22, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 14.09, "data_gap_recovery_extended_stateful": 7.63, "argument_transformation_stateful": 24.25, "grounded_synthesis_stateful": 95.47, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 38, "argument_fidelity": 1, "tool_selection": 0, "basic_2step": 14, "sequential_3step": 4, "conditional_routing": 13, "sequential_reasoning": 3, "error_recovery": 0, "data_gap_recovery": 3, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 3, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 14, "sequential_3step_stateful": 4, "conditional_routing_stateful": 16, "sequential_reasoning_stateful": 5, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 2, 
"data_gap_recovery_extended_stateful": 1, "argument_transformation_stateful": 1, "grounded_synthesis_stateful": 5, "inconsistent_api_recovery_stateful": 0}}, {"label": "mistral:7b-instruct-v0.3-q4_K_M OL/N [reforged:full]", "model": "mistral:7b-instruct-v0.3-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 6.3, "accuracy": 39.6, "completeness": 15.9, "efficiency": 62.6, "wasted": 2.6, "speed": 6.5, "n": 50, "scenarios": {"relevance_detection": 14, "argument_fidelity": 4, "tool_selection": 0, "basic_2step": 12, "sequential_3step": 44, "conditional_routing": 0, "sequential_reasoning": 10, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 6, "sequential_3step_stateful": 50, "conditional_routing_stateful": 4, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, 
"data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 7, "argument_fidelity": 2, "tool_selection": 0, "basic_2step": 6, "sequential_3step": 22, "conditional_routing": 0, "sequential_reasoning": 5, "error_recovery": 1, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 1, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 1, "tool_selection_stateful": 0, "basic_2step_stateful": 3, "sequential_3step_stateful": 25, "conditional_routing_stateful": 2, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 2, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, "sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}, "scenarioValidated": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, "sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, 
"data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}, "scenarioIdealCalls": {"relevance_detection": 7, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 12, "sequential_3step": 66, "conditional_routing": 0, "sequential_reasoning": 20, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 8, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 3, "tool_selection_stateful": 0, "basic_2step_stateful": 6, "sequential_3step_stateful": 75, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 13, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 18, "sequential_3step": 101, "conditional_routing": 0, "sequential_reasoning": 30, "error_recovery": 5, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 14, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 4, "tool_selection_stateful": 0, "basic_2step_stateful": 17, "sequential_3step_stateful": 111, "conditional_routing_stateful": 13, "sequential_reasoning_stateful": 0, 
"error_recovery_stateful": 14, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 6.0, "argument_fidelity": 42.0, "tool_selection": 0.0, "basic_2step": 7.0, "sequential_3step": 45.0, "conditional_routing": 46.0, "sequential_reasoning": 23.0, "error_recovery": 84.0, "data_gap_recovery": 3.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 2.0, "inconsistent_api_recovery": 56.0, "relevance_detection_stateful": 4.0, "argument_fidelity_stateful": 61.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 16.0, "sequential_3step_stateful": 40.0, "conditional_routing_stateful": 15.0, "sequential_reasoning_stateful": 2.0, "error_recovery_stateful": 51.0, "data_gap_recovery_stateful": 3.0, "data_gap_recovery_extended_stateful": 2.0, "argument_transformation_stateful": 1.0, "grounded_synthesis_stateful": 2.0, "inconsistent_api_recovery_stateful": 34.0}, "scenarioWastedN": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, "sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}, "scenarioSpeedSum": {"relevance_detection": 5.57, "argument_fidelity": 37.33, "tool_selection": 0.0, 
"basic_2step": 9.67, "sequential_3step": 196.78, "conditional_routing": 80.48, "sequential_reasoning": 36.64, "error_recovery": 72.39, "data_gap_recovery": 31.36, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 19.6, "inconsistent_api_recovery": 239.08, "relevance_detection_stateful": 5.85, "argument_fidelity_stateful": 40.87, "tool_selection_stateful": 0.0, "basic_2step_stateful": 10.5, "sequential_3step_stateful": 157.16, "conditional_routing_stateful": 39.7, "sequential_reasoning_stateful": 3.43, "error_recovery_stateful": 56.01, "data_gap_recovery_stateful": 11.58, "data_gap_recovery_extended_stateful": 18.14, "argument_transformation_stateful": 24.92, "grounded_synthesis_stateful": 33.24, "inconsistent_api_recovery_stateful": 222.02}, "scenarioSpeedN": {"relevance_detection": 7, "argument_fidelity": 10, "tool_selection": 0, "basic_2step": 7, "sequential_3step": 28, "conditional_routing": 10, "sequential_reasoning": 11, "error_recovery": 14, "data_gap_recovery": 4, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 1, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 4, "sequential_3step_stateful": 29, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 12, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 2, "argument_transformation_stateful": 3, "grounded_synthesis_stateful": 1, "inconsistent_api_recovery_stateful": 19}}, {"label": "mistral:7b-instruct-v0.3-q8_0 OL/N [reforged:full]", "model": "mistral:7b-instruct-v0.3-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 5.7, "accuracy": 42.3, "completeness": 13.5, "efficiency": 62.7, "wasted": 3.2, "speed": 9.4, "n": 50, "scenarios": 
{"relevance_detection": 14, "argument_fidelity": 2, "tool_selection": 2, "basic_2step": 20, "sequential_3step": 34, "conditional_routing": 4, "sequential_reasoning": 10, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 10, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 44, "conditional_routing_stateful": 6, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 7, "argument_fidelity": 1, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 17, "conditional_routing": 2, "sequential_reasoning": 5, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 5, 
"argument_fidelity_stateful": 1, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 22, "conditional_routing_stateful": 3, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 7}, "scenarioValidated": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, 
"inconsistent_api_recovery_stateful": 7}, "scenarioIdealCalls": {"relevance_detection": 7, "argument_fidelity": 3, "tool_selection": 3, "basic_2step": 20, "sequential_3step": 51, "conditional_routing": 8, "sequential_reasoning": 20, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 5, "argument_fidelity_stateful": 3, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 66, "conditional_routing_stateful": 12, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 16, "argument_fidelity": 5, "tool_selection": 6, "basic_2step": 39, "sequential_3step": 72, "conditional_routing": 17, "sequential_reasoning": 31, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 12, "argument_fidelity_stateful": 4, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 91, "conditional_routing_stateful": 23, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 9.0, "argument_fidelity": 55.0, "tool_selection": 3.0, "basic_2step": 19.0, "sequential_3step": 28.0, "conditional_routing": 58.0, "sequential_reasoning": 35.0, "error_recovery": 62.0, "data_gap_recovery": 13.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 7.0, 
"grounded_synthesis": 15.0, "inconsistent_api_recovery": 26.0, "relevance_detection_stateful": 17.0, "argument_fidelity_stateful": 77.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 34.0, "conditional_routing_stateful": 37.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 38.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 4.0, "inconsistent_api_recovery_stateful": 15.0}, "scenarioWastedN": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 7}, "scenarioSpeedSum": {"relevance_detection": 15.16, "argument_fidelity": 46.53, "tool_selection": 7.07, "basic_2step": 19.31, "sequential_3step": 148.65, "conditional_routing": 161.98, "sequential_reasoning": 74.57, "error_recovery": 75.37, "data_gap_recovery": 43.5, "data_gap_recovery_extended": 0.0, "argument_transformation": 20.95, "grounded_synthesis": 135.52, "inconsistent_api_recovery": 195.56, "relevance_detection_stateful": 19.46, "argument_fidelity_stateful": 66.19, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 187.68, "conditional_routing_stateful": 104.46, "sequential_reasoning_stateful": 
0.0, "error_recovery_stateful": 60.51, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 102.09, "inconsistent_api_recovery_stateful": 164.64}, "scenarioSpeedN": {"relevance_detection": 7, "argument_fidelity": 7, "tool_selection": 1, "basic_2step": 10, "sequential_3step": 21, "conditional_routing": 13, "sequential_reasoning": 12, "error_recovery": 13, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 6, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 8, "argument_fidelity_stateful": 11, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 25, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 4, "inconsistent_api_recovery_stateful": 7}}, {"label": "llama3.1:8b-instruct-q4_K_M OL/N [reforged:full]", "model": "llama3.1:8b-instruct-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 4.7, "accuracy": 7.2, "completeness": 64.8, "efficiency": 45.1, "wasted": 3.2, "speed": 3.7, "n": 50, "scenarios": {"relevance_detection": 8, "argument_fidelity": 0, "tool_selection": 2, "basic_2step": 34, "sequential_3step": 4, "conditional_routing": 0, "sequential_reasoning": 6, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 4, "relevance_detection_stateful": 12, "argument_fidelity_stateful": 8, "tool_selection_stateful": 6, "basic_2step_stateful": 14, "sequential_3step_stateful": 6, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, 
"error_recovery_stateful": 6, "data_gap_recovery_stateful": 4, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 4}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 4, "argument_fidelity": 0, "tool_selection": 1, "basic_2step": 17, "sequential_3step": 2, "conditional_routing": 0, "sequential_reasoning": 3, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 4, "tool_selection_stateful": 3, "basic_2step_stateful": 7, "sequential_3step_stateful": 3, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 3, "data_gap_recovery_stateful": 2, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 2}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, 
"conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, "data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, "data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}, "scenarioIdealCalls": {"relevance_detection": 4, "argument_fidelity": 0, "tool_selection": 3, "basic_2step": 34, "sequential_3step": 6, "conditional_routing": 0, "sequential_reasoning": 12, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 16, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 12, "tool_selection_stateful": 9, "basic_2step_stateful": 14, 
"sequential_3step_stateful": 9, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 10, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 16}, "scenarioActualCalls": {"relevance_detection": 11, "argument_fidelity": 0, "tool_selection": 6, "basic_2step": 51, "sequential_3step": 17, "conditional_routing": 0, "sequential_reasoning": 28, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 38, "relevance_detection_stateful": 12, "argument_fidelity_stateful": 42, "tool_selection_stateful": 26, "basic_2step_stateful": 22, "sequential_3step_stateful": 15, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 21, "data_gap_recovery_stateful": 25, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 38}, "scenarioWastedSum": {"relevance_detection": 7.0, "argument_fidelity": 246.0, "tool_selection": 57.0, "basic_2step": 67.0, "sequential_3step": 145.0, "conditional_routing": 126.0, "sequential_reasoning": 217.0, "error_recovery": 90.0, "data_gap_recovery": 165.0, "data_gap_recovery_extended": 111.0, "argument_transformation": 49.0, "grounded_synthesis": 47.0, "inconsistent_api_recovery": 139.0, "relevance_detection_stateful": 16.0, "argument_fidelity_stateful": 191.0, "tool_selection_stateful": 174.0, "basic_2step_stateful": 67.0, "sequential_3step_stateful": 149.0, "conditional_routing_stateful": 114.0, "sequential_reasoning_stateful": 56.0, "error_recovery_stateful": 72.0, "data_gap_recovery_stateful": 61.0, "data_gap_recovery_extended_stateful": 104.0, "argument_transformation_stateful": 49.0, "grounded_synthesis_stateful": 42.0, 
"inconsistent_api_recovery_stateful": 141.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, "data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}, "scenarioSpeedSum": {"relevance_detection": 33.41, "argument_fidelity": 150.41, "tool_selection": 103.05, "basic_2step": 48.53, "sequential_3step": 216.58, "conditional_routing": 109.19, "sequential_reasoning": 124.93, "error_recovery": 90.69, "data_gap_recovery": 136.57, "data_gap_recovery_extended": 164.42, "argument_transformation": 164.24, "grounded_synthesis": 289.69, "inconsistent_api_recovery": 162.33, "relevance_detection_stateful": 34.4, "argument_fidelity_stateful": 111.24, "tool_selection_stateful": 122.42, "basic_2step_stateful": 39.97, "sequential_3step_stateful": 148.81, "conditional_routing_stateful": 80.33, "sequential_reasoning_stateful": 27.92, "error_recovery_stateful": 68.91, "data_gap_recovery_stateful": 64.2, "data_gap_recovery_extended_stateful": 136.16, "argument_transformation_stateful": 101.58, "grounded_synthesis_stateful": 252.92, "inconsistent_api_recovery_stateful": 145.25}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 43, "tool_selection": 46, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 32, "sequential_reasoning": 30, "error_recovery": 18, 
"data_gap_recovery": 33, "data_gap_recovery_extended": 26, "argument_transformation": 29, "grounded_synthesis": 48, "inconsistent_api_recovery": 18, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 30, "tool_selection_stateful": 37, "basic_2step_stateful": 49, "sequential_3step_stateful": 42, "conditional_routing_stateful": 22, "sequential_reasoning_stateful": 6, "error_recovery_stateful": 16, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 19, "argument_transformation_stateful": 19, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 16}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/N [reforged:full]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 5.3, "accuracy": 100.0, "completeness": 5.3, "efficiency": 40.8, "wasted": 1.4, "speed": 1.1, "n": 50, "scenarios": {"relevance_detection": 68, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 70, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, 
"data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 84, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, 
"sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 85, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 50.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 50.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, 
"sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 39.35, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 38.07, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 34, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 35, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": 
"llama3.1:8b-instruct-q8_0 OL/N [reforged:full]", "model": "llama3.1:8b-instruct-q8_0", "backend": "ollama", "mode": "native", "ablation": "reforged", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 4.3, "accuracy": 8.2, "completeness": 52.5, "efficiency": 42.7, "wasted": 2.9, "speed": 4.9, "n": 50, "scenarios": {"relevance_detection": 8, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 20, "sequential_3step": 14, "conditional_routing": 0, "sequential_reasoning": 22, "error_recovery": 0, "data_gap_recovery": 2, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 4, "tool_selection_stateful": 12, "basic_2step_stateful": 10, "sequential_3step_stateful": 8, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 6, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": 
{"relevance_detection": 4, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 10, "sequential_3step": 7, "conditional_routing": 0, "sequential_reasoning": 11, "error_recovery": 0, "data_gap_recovery": 1, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 3, "argument_fidelity_stateful": 2, "tool_selection_stateful": 6, "basic_2step_stateful": 5, "sequential_3step_stateful": 4, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 3, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}, "scenarioValidated": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, 
"argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}, "scenarioIdealCalls": {"relevance_detection": 4, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 20, "sequential_3step": 21, "conditional_routing": 0, "sequential_reasoning": 44, "error_recovery": 0, "data_gap_recovery": 5, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 3, "argument_fidelity_stateful": 6, "tool_selection_stateful": 18, "basic_2step_stateful": 10, "sequential_3step_stateful": 12, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 9, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 8, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 31, "sequential_3step": 63, "conditional_routing": 0, "sequential_reasoning": 112, "error_recovery": 0, "data_gap_recovery": 12, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 6, "argument_fidelity_stateful": 16, "tool_selection_stateful": 56, "basic_2step_stateful": 15, "sequential_3step_stateful": 22, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 15, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, 
"inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 5.0, "argument_fidelity": 157.0, "tool_selection": 57.0, "basic_2step": 85.0, "sequential_3step": 159.0, "conditional_routing": 52.0, "sequential_reasoning": 216.0, "error_recovery": 30.0, "data_gap_recovery": 123.0, "data_gap_recovery_extended": 91.0, "argument_transformation": 16.0, "grounded_synthesis": 77.0, "inconsistent_api_recovery": 91.0, "relevance_detection_stateful": 7.0, "argument_fidelity_stateful": 127.0, "tool_selection_stateful": 131.0, "basic_2step_stateful": 82.0, "sequential_3step_stateful": 130.0, "conditional_routing_stateful": 39.0, "sequential_reasoning_stateful": 74.0, "error_recovery_stateful": 28.0, "data_gap_recovery_stateful": 66.0, "data_gap_recovery_extended_stateful": 44.0, "argument_transformation_stateful": 35.0, "grounded_synthesis_stateful": 27.0, "inconsistent_api_recovery_stateful": 34.0}, "scenarioWastedN": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}, "scenarioSpeedSum": {"relevance_detection": 43.59, "argument_fidelity": 120.73, "tool_selection": 143.45, "basic_2step": 82.63, "sequential_3step": 315.07, "conditional_routing": 103.27, "sequential_reasoning": 205.43, "error_recovery": 37.73, 
"data_gap_recovery": 174.04, "data_gap_recovery_extended": 161.48, "argument_transformation": 72.96, "grounded_synthesis": 408.76, "inconsistent_api_recovery": 114.74, "relevance_detection_stateful": 48.6, "argument_fidelity_stateful": 109.38, "tool_selection_stateful": 118.69, "basic_2step_stateful": 63.48, "sequential_3step_stateful": 235.0, "conditional_routing_stateful": 49.48, "sequential_reasoning_stateful": 53.81, "error_recovery_stateful": 55.3, "data_gap_recovery_stateful": 86.88, "data_gap_recovery_extended_stateful": 82.56, "argument_transformation_stateful": 78.22, "grounded_synthesis_stateful": 326.05, "inconsistent_api_recovery_stateful": 55.58}, "scenarioSpeedN": {"relevance_detection": 50, "argument_fidelity": 21, "tool_selection": 43, "basic_2step": 50, "sequential_3step": 49, "conditional_routing": 20, "sequential_reasoning": 34, "error_recovery": 5, "data_gap_recovery": 33, "data_gap_recovery_extended": 16, "argument_transformation": 8, "grounded_synthesis": 48, "inconsistent_api_recovery": 9, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 21, "tool_selection_stateful": 24, "basic_2step_stateful": 50, "sequential_3step_stateful": 42, "conditional_routing_stateful": 8, "sequential_reasoning_stateful": 10, "error_recovery_stateful": 8, "data_gap_recovery_stateful": 17, "data_gap_recovery_extended_stateful": 9, "argument_transformation_stateful": 6, "grounded_synthesis_stateful": 48, "inconsistent_api_recovery_stateful": 4}}, {"label": "mistral:7b-instruct-v0.3-q8_0 OL/N [bare:full]", "model": "mistral:7b-instruct-v0.3-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 2.6, "accuracy": 11.4, "completeness": 22.9, "efficiency": 100.0, "wasted": 0.0, "speed": 1.4, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 68, "sequential_3step": 0, "conditional_routing": 0, 
"sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 34, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, 
"conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 
0, "basic_2step": 68, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 34, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, 
"basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 55.81, "tool_selection": 14.95, "basic_2step": 30.51, "sequential_3step": 66.98, "conditional_routing": 4.25, "sequential_reasoning": 18.25, "error_recovery": 4.08, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 19.97, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 40.79, "tool_selection_stateful": 13.16, "basic_2step_stateful": 32.99, "sequential_3step_stateful": 68.75, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 11.09, "error_recovery_stateful": 5.63, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, 
"grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 17.85}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 44, "tool_selection": 11, "basic_2step": 44, "sequential_3step": 37, "conditional_routing": 1, "sequential_reasoning": 7, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 7, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 37, "tool_selection_stateful": 10, "basic_2step_stateful": 48, "sequential_3step_stateful": 34, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 4, "error_recovery_stateful": 4, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 6}}, {"label": "mistral:7b-instruct-v0.3-q4_K_M OL/N [bare:full]", "model": "mistral:7b-instruct-v0.3-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 2.7, "accuracy": 12.0, "completeness": 22.5, "efficiency": 100.0, "wasted": 0.0, "speed": 1.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 70, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 35, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, 
"grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, "sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, "sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 70, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 35, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 
4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, "sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 26.47, "tool_selection": 14.46, "basic_2step": 18.61, "sequential_3step": 33.73, "conditional_routing": 3.14, "sequential_reasoning": 8.16, "error_recovery": 9.7, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 3.13, "grounded_synthesis": 8.19, "inconsistent_api_recovery": 18.65, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 26.43, "tool_selection_stateful": 14.14, "basic_2step_stateful": 21.86, "sequential_3step_stateful": 48.71, "conditional_routing_stateful": 2.37, "sequential_reasoning_stateful": 1.06, "error_recovery_stateful": 6.98, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 15.71}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 36, "tool_selection": 17, "basic_2step": 38, "sequential_3step": 27, "conditional_routing": 1, "sequential_reasoning": 4, "error_recovery": 12, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 1, "grounded_synthesis": 2, "inconsistent_api_recovery": 10, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 31, "tool_selection_stateful": 14, "basic_2step_stateful": 43, 
"sequential_3step_stateful": 34, "conditional_routing_stateful": 1, "sequential_reasoning_stateful": 1, "error_recovery_stateful": 11, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 9}}, {"label": "llama3.1:8b-instruct-q4_K_M OL/N [bare:full]", "model": "llama3.1:8b-instruct-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.5, "accuracy": 2.2, "completeness": 24.1, "efficiency": 100.0, "wasted": 0.0, "speed": 1.2, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 14, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, 
"conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 7, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 1, "sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 
0, "basic_2step": 1, "sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 14, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 7, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 
0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 1, "sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}, 
"scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 7.62, "tool_selection": 0.0, "basic_2step": 0.51, "sequential_3step": 1.95, "conditional_routing": 25.53, "sequential_reasoning": 0.0, "error_recovery": 18.11, "data_gap_recovery": 24.0, "data_gap_recovery_extended": 18.25, "argument_transformation": 37.71, "grounded_synthesis": 58.91, "inconsistent_api_recovery": 3.04, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.23, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 6.24, "conditional_routing_stateful": 12.73, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 19.1, "data_gap_recovery_stateful": 25.59, "data_gap_recovery_extended_stateful": 23.59, "argument_transformation_stateful": 25.13, "grounded_synthesis_stateful": 59.69, "inconsistent_api_recovery_stateful": 6.8}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 6, "tool_selection": 0, "basic_2step": 1, "sequential_3step": 3, "conditional_routing": 20, "sequential_reasoning": 0, "error_recovery": 30, "data_gap_recovery": 29, "data_gap_recovery_extended": 17, "argument_transformation": 16, "grounded_synthesis": 32, "inconsistent_api_recovery": 2, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 5, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 8, "conditional_routing_stateful": 17, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 37, "data_gap_recovery_stateful": 28, "data_gap_recovery_extended_stateful": 20, "argument_transformation_stateful": 8, "grounded_synthesis_stateful": 30, "inconsistent_api_recovery_stateful": 4}}, {"label": "llama3.1:8b-instruct-q8_0 OL/N [bare:full]", "model": "llama3.1:8b-instruct-q8_0", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "llama3.1", "quant": "q8_0", "gen": 1, "retired": true, "score": 0.2, "accuracy": 0.5, "completeness": 29.4, "efficiency": 100.0, 
"wasted": 0.0, "speed": 1.6, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 
0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 19}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 
2, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 19}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 4, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 2, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 
0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 19}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 4.89, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 15.55, "conditional_routing": 30.56, "sequential_reasoning": 1.51, "error_recovery": 28.58, "data_gap_recovery": 42.56, "data_gap_recovery_extended": 50.7, "argument_transformation": 23.87, "grounded_synthesis": 78.96, "inconsistent_api_recovery": 43.42, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 3.45, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 14.21, "conditional_routing_stateful": 37.91, "sequential_reasoning_stateful": 0.0, 
"error_recovery_stateful": 32.3, "data_gap_recovery_stateful": 26.8, "data_gap_recovery_extended_stateful": 26.64, "argument_transformation_stateful": 6.75, "grounded_synthesis_stateful": 87.16, "inconsistent_api_recovery_stateful": 43.5}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 3, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 15, "conditional_routing": 20, "sequential_reasoning": 1, "error_recovery": 41, "data_gap_recovery": 32, "data_gap_recovery_extended": 15, "argument_transformation": 7, "grounded_synthesis": 40, "inconsistent_api_recovery": 19, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 2, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 15, "conditional_routing_stateful": 21, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 47, "data_gap_recovery_stateful": 22, "data_gap_recovery_extended_stateful": 15, "argument_transformation_stateful": 2, "grounded_synthesis_stateful": 46, "inconsistent_api_recovery_stateful": 19}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/N [reforged:full]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 
0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 
0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 
0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, 
"tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-7B-Instruct-v0.3-Q4_K_M LS/N [bare:full]", "model": "Mistral-7B-Instruct-v0.3-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 
50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": 
{"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 
0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, 
"inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/N [bare:full]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, 
"efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, 
"inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, 
"argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 
0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, 
"error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "mistral-nemo:12b-instruct-2407-q4_K_M OL/N [bare:full]", "model": "mistral-nemo:12b-instruct-2407-q4_K_M", "backend": "ollama", "mode": "native", "ablation": "bare", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, 
"error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, 
"sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, 
"sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 
0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, 
"tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-Nemo-Instruct-2407-Q4_K_M LS/N [bare:full]", "model": "Mistral-Nemo-Instruct-2407-Q4_K_M", "backend": "llamaserver", "mode": "native", "ablation": "bare", "replay": "full", "family": "mistral-nemo", "quant": "q4_K_M", "gen": 1, "retired": true, "score": 0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, 
"tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, 
"scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, 
"argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, 
"grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}, {"label": "Mistral-7B-Instruct-v0.3-Q8_0 LS/N [reforged:full]", "model": "Mistral-7B-Instruct-v0.3-Q8_0", "backend": "llamaserver", "mode": "native", "ablation": "reforged", "replay": "full", "family": "mistral-v0.3", "quant": "q8_0", "gen": 1, "retired": true, "score": 
0.0, "accuracy": null, "completeness": 0.0, "efficiency": 0.0, "wasted": 0.0, "speed": 0.0, "n": 50, "scenarios": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioRuns": {"relevance_detection": 50, "argument_fidelity": 50, "tool_selection": 50, "basic_2step": 50, "sequential_3step": 50, "conditional_routing": 50, "sequential_reasoning": 50, "error_recovery": 50, "data_gap_recovery": 50, "data_gap_recovery_extended": 50, "argument_transformation": 50, "grounded_synthesis": 50, "inconsistent_api_recovery": 50, "relevance_detection_stateful": 50, "argument_fidelity_stateful": 50, "tool_selection_stateful": 50, "basic_2step_stateful": 50, "sequential_3step_stateful": 50, "conditional_routing_stateful": 50, "sequential_reasoning_stateful": 50, "error_recovery_stateful": 50, "data_gap_recovery_stateful": 50, "data_gap_recovery_extended_stateful": 50, "argument_transformation_stateful": 50, "grounded_synthesis_stateful": 50, "inconsistent_api_recovery_stateful": 50}, "scenarioCorrect": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, 
"argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioCompleted": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioValidated": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, 
"data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioIdealCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioActualCalls": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioWastedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, 
"data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, "conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioWastedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}, "scenarioSpeedSum": {"relevance_detection": 0.0, "argument_fidelity": 0.0, "tool_selection": 0.0, "basic_2step": 0.0, "sequential_3step": 0.0, "conditional_routing": 0.0, "sequential_reasoning": 0.0, "error_recovery": 0.0, "data_gap_recovery": 0.0, "data_gap_recovery_extended": 0.0, "argument_transformation": 0.0, "grounded_synthesis": 0.0, "inconsistent_api_recovery": 0.0, "relevance_detection_stateful": 0.0, "argument_fidelity_stateful": 0.0, "tool_selection_stateful": 0.0, "basic_2step_stateful": 0.0, "sequential_3step_stateful": 0.0, 
"conditional_routing_stateful": 0.0, "sequential_reasoning_stateful": 0.0, "error_recovery_stateful": 0.0, "data_gap_recovery_stateful": 0.0, "data_gap_recovery_extended_stateful": 0.0, "argument_transformation_stateful": 0.0, "grounded_synthesis_stateful": 0.0, "inconsistent_api_recovery_stateful": 0.0}, "scenarioSpeedN": {"relevance_detection": 0, "argument_fidelity": 0, "tool_selection": 0, "basic_2step": 0, "sequential_3step": 0, "conditional_routing": 0, "sequential_reasoning": 0, "error_recovery": 0, "data_gap_recovery": 0, "data_gap_recovery_extended": 0, "argument_transformation": 0, "grounded_synthesis": 0, "inconsistent_api_recovery": 0, "relevance_detection_stateful": 0, "argument_fidelity_stateful": 0, "tool_selection_stateful": 0, "basic_2step_stateful": 0, "sequential_3step_stateful": 0, "conditional_routing_stateful": 0, "sequential_reasoning_stateful": 0, "error_recovery_stateful": 0, "data_gap_recovery_stateful": 0, "data_gap_recovery_extended_stateful": 0, "argument_transformation_stateful": 0, "grounded_synthesis_stateful": 0, "inconsistent_api_recovery_stateful": 0}}], "scenarios": ["relevance_detection", "argument_fidelity", "tool_selection", "basic_2step", "sequential_3step", "conditional_routing", "sequential_reasoning", "error_recovery", "data_gap_recovery", "data_gap_recovery_extended", "argument_transformation", "grounded_synthesis", "inconsistent_api_recovery", "relevance_detection_stateful", "argument_fidelity_stateful", "tool_selection_stateful", "basic_2step_stateful", "sequential_3step_stateful", "conditional_routing_stateful", "sequential_reasoning_stateful", "error_recovery_stateful", "data_gap_recovery_stateful", "data_gap_recovery_extended_stateful", "argument_transformation_stateful", "grounded_synthesis_stateful", "inconsistent_api_recovery_stateful"], "scenarioAbbrev": {"relevance_detection": "rel", "argument_fidelity": "arg", "tool_selection": "tsl", "basic_2step": "b2s", "sequential_3step": "s3s", "conditional_routing": 
"crt", "sequential_reasoning": "srn", "error_recovery": "err", "data_gap_recovery": "dgr", "data_gap_recovery_extended": "dge", "argument_transformation": "art", "grounded_synthesis": "grs", "inconsistent_api_recovery": "iar", "relevance_detection_stateful": "rel_s", "argument_fidelity_stateful": "arg_s", "tool_selection_stateful": "tsl_s", "basic_2step_stateful": "b2s_s", "sequential_3step_stateful": "s3s_s", "conditional_routing_stateful": "crt_s", "sequential_reasoning_stateful": "srn_s", "error_recovery_stateful": "err_s", "data_gap_recovery_stateful": "dgr_s", "data_gap_recovery_extended_stateful": "dge_s", "argument_transformation_stateful": "art_s", "grounded_synthesis_stateful": "grs_s", "inconsistent_api_recovery_stateful": "iar_s"}, "scenarioSuite": {"relevance_detection": "og18", "argument_fidelity": "og18", "tool_selection": "og18", "basic_2step": "og18", "sequential_3step": "og18", "conditional_routing": "og18", "sequential_reasoning": "og18", "error_recovery": "og18", "data_gap_recovery": "og18", "data_gap_recovery_extended": "advanced_reasoning", "argument_transformation": "advanced_reasoning", "grounded_synthesis": "advanced_reasoning", "inconsistent_api_recovery": "advanced_reasoning", "relevance_detection_stateful": "og18", "argument_fidelity_stateful": "og18", "tool_selection_stateful": "og18", "basic_2step_stateful": "og18", "sequential_3step_stateful": "og18", "conditional_routing_stateful": "og18", "sequential_reasoning_stateful": "og18", "error_recovery_stateful": "og18", "data_gap_recovery_stateful": "og18", "data_gap_recovery_extended_stateful": "advanced_reasoning", "argument_transformation_stateful": "advanced_reasoning", "grounded_synthesis_stateful": "advanced_reasoning", "inconsistent_api_recovery_stateful": "advanced_reasoning"}, "maxGen": 3, "genInfo": {"1": {"commit": "2b05dc4", "date": "2026-05-08", "note": "v0.6.0 suite \u2014 incl. 
Anthropic ablation"}, "2": {"commit": "655e1f6", "date": "2026-05-22", "note": "v0.7.0 lineup refresh (8\u201314B) + 32GB tier debut (v0.7.4)"}, "3": {"commit": "v0.7.5", "date": "2026-06-11", "note": "reasoning-replay grid (8\u201314B \u00d7 none/keep-last/full) + Claude thinking-on baseline"}}, "timestamp": "2026-06-11 20:28"};</script>
 </head>
   <body class="bg-zinc-950 text-zinc-100 min-h-screen">
     <div id="root"></div>
+
   </body>
 </html>
diff --git a/docs/results/index.md b/docs/results/index.md
index 0c4e176..7574620 100644
--- a/docs/results/index.md
+++ b/docs/results/index.md
@@ -15,5 +15,6 @@ For model and backend recommendations, see [Model Guide](../MODEL_GUIDE.md).
 ## Other cross-cuts
 
 - [native-vs-prompt.md](raw/native-vs-prompt.md) — llama-server native FC vs prompt-injected, reforged only
+- [reasoning-replay.md](raw/reasoning-replay.md) — reasoning_replay policy comparison (none / keep-last / full) per config
 
-*Generated 2026-06-03 00:09*
+*Generated 2026-06-11 20:28*
diff --git a/docs/results/raw/native-vs-prompt.md b/docs/results/raw/native-vs-prompt.md
index 7243f1b..6636389 100644
--- a/docs/results/raw/native-vs-prompt.md
+++ b/docs/results/raw/native-vs-prompt.md
@@ -6,20 +6,24 @@
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]    78.1%    78.1%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    16     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]    80.2%    80.2%   100.0%   100%   0.0   2.9s    50    100   100   100   100   100   100   100   100   100     0     0    36   100   100   100   100   100   100   100   100   100   100     0     0    50   100
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]    77.8%    77.8%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100     8     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]    80.6%    80.6%   100.0%   100%   0.0   3.0s    50    100   100   100   100   100   100   100   100   100     0     0    46   100   100   100   100   100   100   100   100   100    98     0     0    52   100
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-14B-Reasoning-2512-Q4_K_M
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]    84.5%    84.5%   100.0%    97%   0.6   5.4s    50    100   100   100   100   100    88   100   100    70    44    48    76    94   100   100   100   100   100    96    98   100    76    38    26    62    82
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]    79.5%    80.5%    98.7%    96%   0.5   3.7s    50    100   100   100   100   100    82   100   100    78    30     6    58    92   100   100   100   100   100    74   100   100    80    20     6    56    84
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]              83.3%    83.3%   100.0%    96%   0.6   4.8s    50    100   100   100   100   100   100   100   100    60    32    34    78    94   100   100   100   100   100    92   100   100    62    30    28    78    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.9%    82.9%   100.0%    95%   0.6   5.0s    50    100   100   100   100   100    92   100   100    68    40    30    60    94   100   100   100   100   100    96   100   100    62    36    20    72    86
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         83.2%    83.2%   100.0%    97%   0.6   5.6s    50    100   100   100   100   100    86    98   100    68    40    32    76    96   100   100   100   100   100    92   100   100    62    34    30    68    82
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]              77.7%    78.8%    98.5%    95%   0.6   3.8s    50    100   100   100   100   100    74   100   100    78    32     2    46    90   100   100   100   100   100    66   100   100    74    28     4    48    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    78.0%    78.8%    98.9%    95%   0.6   3.8s    50    100   100   100   100   100    82   100   100    70    28     2    52    78   100   100   100   100   100    74   100   100    80    18     4    48    92
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.2%    82.0%    98.9%    98%   0.6   3.8s    50    100   100   100   100   100    74   100   100    68    40     4    72    88   100   100   100   100   100    82   100   100    78    38    10    64    92
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Instruct-2512-Q4_K_M
@@ -28,8 +32,8 @@ Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]    79.5%    80.5%    98.7%
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]    78.3%    78.4%    99.8%    95%   0.4   3.2s    50    100   100   100   100   100   100   100    98   100    22     0     0   100   100   100   100   100   100   100   100    98   100    14     2     2   100
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]    75.6%    83.8%    90.2%    79%   1.3   3.0s    50     98   100     0   100   100   100   100   100   100    22    12    56   100   100   100     0   100   100   100   100   100   100    28     0    50   100
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]    78.3%    78.3%   100.0%    95%   0.4   3.2s    50    100   100   100   100   100   100   100   100   100    18     0     0   100   100   100   100   100   100   100   100   100   100    16     2     0   100
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]    74.9%    83.3%    89.9%    79%   1.3   3.1s    50    100   100     0   100   100   100   100   100   100    20     2    42   100   100   100     0   100   100   100   100   100   100    22     0    62   100
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
@@ -39,163 +43,193 @@ Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]    75.6%    83.8%    90.2%
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                          Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]    81.4%    81.4%   100.0%   100%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    38     4     0   100   100   100   100   100   100   100   100   100   100    74     0     0   100
-Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]    84.4%    91.1%    92.6%    92%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100   100     0    98   100
+Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]    81.0%    81.0%   100.0%   100%   0.3   4.1s    50    100   100   100   100   100   100   100   100   100    30     0     4   100   100   100   100   100   100   100   100   100   100    68     0     4   100
+Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]    84.2%    90.9%    92.7%    91%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100    96     0    98   100
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Reasoning-2512-Q4_K_M
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]    82.8%    82.8%    99.9%    95%   0.5   4.1s    50    100   100   100   100   100   100   100   100    98    66    24    34    92   100   100   100   100   100    96   100   100   100    70     0    30    42
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]    80.5%    83.0%    97.0%    96%   0.7   2.7s    50    100    98    98   100   100    98    98   100    96    74    10    36    70   100    98    96   100   100    98   100    98    94    70     2    38    20
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]              81.4%    81.6%    99.7%    94%   0.6   3.8s    50    100   100    98   100   100   100   100   100   100    68    24    18    86   100   100    98   100   100   100   100   100    96    70     4    28    26
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.4%    83.0%    99.3%    92%   0.6   4.2s    50    100   100   100   100   100    98   100   100   100    68    16    28    86   100   100   100   100   100    96    98   100    96    64     6    24    62
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         81.8%    81.8%   100.0%    95%   0.5   4.3s    50    100   100    98   100    98    98    98   100    98    62     8    30    96   100   100    98   100   100    98    96   100    98    62     2    40    46
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]              79.5%    81.9%    97.0%    94%   0.7   2.8s    50    100   100    98   100   100    96   100    90    96    68     4    32    68   100    98    98   100   100    96    96    94    98    70     0    34    30
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    79.8%    82.6%    96.7%    95%   0.6   2.8s    50    100   100    96   100   100    98    98    94    98    66     8    36    64   100   100    96   100   100   100    98    96    94    76     0    34    24
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.0%    83.1%    97.5%    95%   0.7   3.0s    50    100    98    88   100   100    96   100    98    98    84    24    38    70   100   100   100   100   100    96    98   100    96    72     2    22    26
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Reasoning-2512-Q8_0
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]    84.2%    84.2%   100.0%    95%   0.5   6.0s    50    100   100   100   100   100   100   100   100    98    74    26    54    88   100   100   100   100   100   100   100   100   100    68     2    26    52
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]    81.3%    85.0%    95.7%    96%   0.7   3.9s    50    100   100    96   100   100    98   100    92    98    56    14    46    70   100   100    88   100   100    98   100    94   100    82     0    48    34
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]              84.5%    84.8%    99.7%    96%   0.6   5.3s    50    100   100    96   100   100    98   100   100   100    76    18    44    92   100   100   100   100   100   100    98   100    98    80     2    42    54
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:keep-last]    84.8%    85.6%    99.1%    94%   0.6   5.9s    50    100   100   100   100   100   100   100   100    98    70    24    42    86   100   100   100   100   100   100   100   100   100    82     6    36    62
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:full]         83.1%    83.1%    99.9%    96%   0.5   6.0s    50    100   100   100   100   100   100    98   100   100    66    20    36    88   100   100   100   100   100   100   100   100    98    74     2    26    52
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]              80.9%    84.8%    95.4%    95%   0.7   3.9s    50    100   100    90   100   100    98   100    98   100    72     6    50    76   100   100    86   100   100   100   100    92    96    64     2    42    32
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:keep-last]    81.2%    84.6%    96.0%    97%   0.7   4.1s    50    100   100    94   100   100    94   100    92    98    72    12    42    78   100   100    86   100   100    98   100    98   100    64     0    52    32
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:full]         81.4%    84.8%    96.0%    95%   0.7   4.0s    50    100   100    88   100   100    94   100    96   100    72    12    58    80   100    98    88   100   100    96   100    92   100    64     2    36    40
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged]    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged]    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged:full]²    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged:full]²    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Nemotron-3-Nano-30B-A3B-Q4_K_M
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged]    71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged]    70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged:full]²    71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged:full]²    70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3-14B-Q4_K_M
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-14B-Q4_K_M LS/N [reforged]    67.7%    67.7%    99.9%    85%   0.9  20.8s    50    100   100   100   100   100    94   100    62    36     4    22    44    22   100   100   100   100    98    84   100    66    24    12    18    42    32
-Qwen3-14B-Q4_K_M LS/P [reforged]    70.5%    70.8%    99.7%    86%   0.5  24.2s    50    100   100   100   100   100    94   100    64    68     0     0    32    72   100   100   100   100    98    94   100    58    66     0     0    30    58
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-14B-Q4_K_M LS/N [reforged]              68.5%    68.5%   100.0%   100%   0.4  21.8s    50    100   100   100   100    98   100   100    56    72    14     4    58     4   100   100   100   100   100   100   100    42    72    10     0    52     0
+Qwen3-14B-Q4_K_M LS/N [reforged:keep-last]    64.0%    64.0%    99.9%    91%   0.6  20.3s    50    100   100   100   100   100    90    98    48    40    18     6    38     4   100   100   100    98    96    88   100    54    30    16     2    38     0
+Qwen3-14B-Q4_K_M LS/N [reforged:full]         68.4%    68.4%    99.9%    83%   0.9  21.9s    50    100   100   100   100    98    90    98    60    32    20    18    50    38   100   100   100   100   100    86   100    74    34     6    18    34    22
+Qwen3-14B-Q4_K_M LS/P [reforged]              71.4%    71.4%    99.9%    86%   0.5  25.6s    50    100   100   100   100   100    98   100    70    70     0     0    30    80   100   100   100   100   100    92   100    76    56     0     0    22    62
+Qwen3-14B-Q4_K_M LS/P [reforged:keep-last]    71.8%    71.8%   100.0%    86%   0.5  23.8s    50    100   100   100   100   100    98   100    72    58     2     4    28    72   100   100   100   100   100    94   100    76    74     0     0    32    56
+Qwen3-14B-Q4_K_M LS/P [reforged:full]         71.8%    71.9%    99.8%    87%   0.5  24.3s    50    100   100   100   100   100    96   100    72    72     2     0    30    74   100   100   100   100   100    92   100    74    68     0     0    38    48
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3-8B-Q4_K_M
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q4_K_M LS/N [reforged]    68.2%    68.4%    99.6%    86%   0.7  16.1s    50     98   100   100   100   100    92   100    48    78     0    44     8    38   100   100   100   100   100    90   100    40    76     0     8    14    38
-Qwen3-8B-Q4_K_M LS/P [reforged]    70.4%    70.7%    99.6%    86%   0.5  17.8s    50    100   100   100   100   100    94   100    56    64     0    14    12    92   100   100   100   100   100    94   100    62    58     0     0     6    78
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/N [reforged]              67.3%    67.5%    99.7%    96%   0.3  15.6s    50    100   100   100   100   100   100   100    40    98     6    14    22     2   100   100   100   100   100   100   100    48    86     2     0    26     6
+Qwen3-8B-Q4_K_M LS/N [reforged:keep-last]    64.5%    64.6%    99.9%    91%   0.4  15.0s    50    100   100   100   100   100   100   100    30    82     0    28    12    10   100   100   100    98   100    96   100    22    86     2     2     6     4
+Qwen3-8B-Q4_K_M LS/N [reforged:full]         65.8%    66.0%    99.7%    84%   0.7  17.2s    50    100   100   100   100   100    94   100    34    66     0    18    10    38   100   100   100    96   100    86   100    34    74     0    10    12    40
+Qwen3-8B-Q4_K_M LS/P [reforged]              71.1%    71.2%    99.8%    87%   0.5  18.0s    50    100   100   100   100   100    96   100    70    66     0    18     8    96   100   100   100   100    98    96   100    60    60     0     0    10    70
+Qwen3-8B-Q4_K_M LS/P [reforged:keep-last]    72.2%    72.3%    99.9%    88%   0.5  17.9s    50    100   100   100   100   100   100   100    62    66     0    30     8    90   100   100   100   100   100    98   100    74    68     0     0     8    74
+Qwen3-8B-Q4_K_M LS/P [reforged:full]         70.5%    70.8%    99.6%    88%   0.4  17.4s    50    100   100   100   100   100    88   100    58    66     0    24    10    88   100   100   100   100   100    94    98    62    66     0     0     4    76
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3-8B-Q8_0
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q8_0 LS/N [reforged]    70.3%    70.5%    99.7%    88%   0.6  24.1s    50    100   100   100   100   100   100   100    60    82     4    22    20    32   100   100   100   100    98    94   100    58    66     2    12    28    50
-Qwen3-8B-Q8_0 LS/P [reforged]    73.1%    73.2%    99.8%    89%   0.4  28.4s    50    100   100   100   100   100   100   100    58    96     0     8    28    94   100   100   100   100    96   100    98    64    88     0     0    12    58
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/N [reforged]              68.2%    68.5%    99.5%    95%   0.3  24.8s    50    100   100   100   100   100   100   100    56    78     6    24    28     6    98   100   100   100   100   100   100    52    84     6     2    30     2
+Qwen3-8B-Q8_0 LS/N [reforged:keep-last]    67.0%    67.2%    99.8%    92%   0.4  23.2s    50    100   100   100   100   100    94    98    48    84     2    22    10    12   100   100   100   100   100    96   100    52    80     0    18    20     6
+Qwen3-8B-Q8_0 LS/N [reforged:full]         69.3%    69.6%    99.6%    88%   0.6  24.7s    50     98   100   100   100   100    94   100    48    76     2    28    24    46   100   100   100   100   100    98   100    46    76     2     8    16    40
+Qwen3-8B-Q8_0 LS/P [reforged]              72.0%    72.3%    99.6%    88%   0.4  28.6s    50    100   100   100   100   100    96   100    56    90     0     2    10    96   100   100   100   100    96    98   100    58    88     0     0    20    62
+Qwen3-8B-Q8_0 LS/P [reforged:keep-last]    72.8%    73.0%    99.7%    89%   0.4  28.0s    50    100   100   100   100   100    98    98    80    90     0     6     8    94   100   100   100   100    96    96   100    60    98     0     2    12    54
+Qwen3-8B-Q8_0 LS/P [reforged:full]         72.8%    72.9%    99.8%    88%   0.4  28.9s    50    100   100   100   100   100    98   100    70    90     0     4    20    96   100   100   100   100    92   100    96    66    92     0     0    12    56
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.5-27B-Q4_K_M
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-27B-Q4_K_M LS/N [reforged]    93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
-Qwen3.5-27B-Q4_K_M LS/P [reforged]    86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-27B-Q4_K_M LS/N [reforged:full]²    93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
+Qwen3.5-27B-Q4_K_M LS/P [reforged:full]²    86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.5-35B-A3B-Q4_K_M
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                               Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged]    92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
-Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged]    82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged:full]²    92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
+Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged:full]²    82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.6-27B-Q4_K_M
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-27B-Q4_K_M LS/N [reforged]    92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
-Qwen3.6-27B-Q4_K_M LS/P [reforged]    83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-27B-Q4_K_M LS/N [reforged:full]²    92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
+Qwen3.6-27B-Q4_K_M LS/P [reforged:full]²    83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.6-35B-A3B-UD-Q4_K_M
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged]    94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged]    82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged:full]²    94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged:full]²    82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma-4-E4B-it-Q4_K_M
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q4_K_M LS/N [reforged]    78.2%    82.2%    95.1%    98%   0.5   9.0s    50    100   100   100   100   100    92    98    98    90     0    24    80    50   100   100   100   100   100    94    90    94    98     0     0    84    40
-gemma-4-E4B-it-Q4_K_M LS/P [reforged]    72.8%    72.8%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    38    98    94    94     0    18    26    96   100   100   100   100   100    26    98   100    92     0     2    22    90
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/N [reforged]              79.7%    79.9%    99.8%   100%   0.3   8.1s    50    100   100   100   100   100    94    98   100    84     8    30    64    92   100   100   100   100   100    88    94   100    86     2     0    48    84
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:keep-last]    78.7%    81.4%    96.7%    99%   0.5   9.3s    50    100   100   100   100   100    96    92    96    96     6    20    64    62   100   100   100   100   100    88    88   100    98     2     0    82    56
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:full]         79.2%    82.2%    96.3%    99%   0.5  10.0s    50    100   100   100   100   100    96    88    94   100     2    40    78    54   100   100   100   100   100    94    84    98    96     0     0    82    52
+gemma-4-E4B-it-Q4_K_M LS/P [reforged]              72.4%    72.4%    99.9%    85%   0.6   8.9s    50    100   100   100   100   100    56    96   100    86     0    14    28    84   100   100   100   100   100    34   100    96    74     0     0    22    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:keep-last]    73.2%    73.2%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    40    98    96    92     0    10    24    96   100   100   100   100   100    38   100   100    86     0     0    32    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:full]         72.9%    72.9%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    54    98    98    86     0    10    28    88   100   100   100   100   100    26    98    96    94     0     0    30    90
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma-4-E4B-it-Q8_0
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q8_0 LS/N [reforged]    76.2%    80.7%    94.5%    98%   0.6  12.8s    50    100   100   100   100   100    84    88    90    96     2    14    80    44   100   100   100   100   100    88    90    96    94     4     0    80    32
-gemma-4-E4B-it-Q8_0 LS/P [reforged]    74.7%    74.7%   100.0%    85%   0.6  12.7s    50    100   100   100   100   100    70   100    90    88     0    16    34    94   100   100   100   100   100    48   100    98    84     0     0    30    90
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/N [reforged]              77.8%    77.9%    99.8%   100%   0.2  10.8s    50    100   100   100   100   100    76    90   100   100     0    18    38    98   100   100   100   100   100    76    98   100    98     0     0    36    94
+gemma-4-E4B-it-Q8_0 LS/N [reforged:keep-last]    75.7%    79.3%    95.5%    99%   0.5  13.1s    50    100   100   100   100   100    76    92    98    96     4    16    82    48   100   100   100   100   100    50    86    98    94     2     0    86    40
+gemma-4-E4B-it-Q8_0 LS/N [reforged:full]         75.6%    80.8%    93.6%    98%   0.6  12.5s    50    100   100   100   100   100    92    94    92    88     0    18    82    28   100   100   100   100   100    76    92    94    98     2     0    84    26
+gemma-4-E4B-it-Q8_0 LS/P [reforged]              73.2%    73.3%    99.8%    85%   0.6  13.3s    50    100   100   100   100   100    54    98    94    80     0    28    20    94   100   100   100   100   100    40   100    98    90     0     0    12    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:keep-last]    74.1%    74.2%    99.8%    85%   0.6  13.4s    50    100   100   100   100   100    48   100    96    84     0    22    28    94   100   100   100   100   100    52   100    98    90     0     0    18    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:full]         73.7%    73.7%   100.0%    86%   0.6  12.7s    50    100   100   100   100   100    48   100    92    90     0    36    28    88   100   100   100   100   100    48    94    98    82     0     0    20    92
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite-4.1-8b-Q4_K_M
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite-4.1-8b-Q4_K_M LS/N [reforged]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-granite-4.1-8b-Q4_K_M LS/P [reforged]    61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite-4.1-8b-Q4_K_M LS/N [reforged]              65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:keep-last]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:full]         65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/P [reforged]              61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite-4.1-8b-Q8_0
@@ -204,7 +238,7 @@ granite-4.1-8b-Q4_K_M LS/P [reforged]    61.5%    61.5%   100.0%    90%   0.3
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite-4.1-8b-Q8_0 LS/N [reforged]    65.4%    65.4%   100.0%    88%   1.4   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q8_0 LS/N [reforged]    65.4%    65.4%   100.0%    88%   1.3   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
 granite-4.1-8b-Q8_0 LS/P [reforged]    61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
@@ -212,8 +246,10 @@ granite-4.1-8b-Q8_0 LS/P [reforged]    61.5%    66.7%    92.3%    73%   1.0   5.
 Scr=score(correct/total), Acc=accuracy(correct/total, excl validate errors), Cmp=completeness(completed/total), Eff=efficiency(ideal/actual calls), Wst=avg wasted calls, Spd=avg time(excl compaction)
 rel=relevance_detection, arg=argument_fidelity, tsl=tool_selection, b2s=basic_2step, s3s=sequential_3step, crt=conditional_routing, srn=sequential_reasoning, err=error_recovery, dgr=data_gap_recovery, dge=data_gap_recovery_extended, art=argument_transformation, grs=grounded_synthesis, iar=inconsistent_api_recovery, rel_s=relevance_detection_stateful, arg_s=argument_fidelity_stateful, tsl_s=tool_selection_stateful, b2s_s=basic_2step_stateful, s3s_s=sequential_3step_stateful, crt_s=conditional_routing_stateful, srn_s=sequential_reasoning_stateful, err_s=error_recovery_stateful, dgr_s=data_gap_recovery_stateful, dge_s=data_gap_recovery_extended_stateful, art_s=argument_transformation_stateful, grs_s=grounded_synthesis_stateful, iar_s=inconsistent_api_recovery_stateful
 Ablation: full=all guardrails, no_rescue=no rescue loop, no_nudge=no rescue/retry nudge, no_steps=no step enforcement, no_recovery=no error recovery, no_compact=no compaction, bare=all guardrails off
+Replay: ':keep-last'/':full' tags = reasoning_replay policy (how much captured reasoning is re-sent to the backend each turn); untagged = none (default). Rows predating the knob ran unbounded replay and count as full.
 
 Eval generations (older runs carried forward, superscript-tagged):
   ¹ gen 1 — v0.6.0 suite — incl. Anthropic ablation (commit 2b05dc4, 2026-05-08)
+  ² gen 2 — v0.7.0 lineup refresh (8–14B) + 32GB tier debut (v0.7.4) (commit 655e1f6, 2026-05-22)
 
-*Generated 2026-06-03 00:09*
+*Generated 2026-06-11 20:28*
diff --git a/docs/results/raw/reasoning-replay.md b/docs/results/raw/reasoning-replay.md
new file mode 100644
index 0000000..ae2f00e
--- /dev/null
+++ b/docs/results/raw/reasoning-replay.md
@@ -0,0 +1,420 @@
+# Forge Eval — Reasoning Replay Policies
+
+## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/native) [reforged]
+
+```
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]              84.5%    84.8%    99.7%    96%   0.6   5.3s    50    100   100    96   100   100    98   100   100   100    76    18    44    92   100   100   100   100   100   100    98   100    98    80     2    42    54
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:keep-last]    84.8%    85.6%    99.1%    94%   0.6   5.9s    50    100   100   100   100   100   100   100   100    98    70    24    42    86   100   100   100   100   100   100   100   100   100    82     6    36    62
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:full]         83.1%    83.1%    99.9%    96%   0.5   6.0s    50    100   100   100   100   100   100    98   100   100    66    20    36    88   100   100   100   100   100   100   100   100    98    74     2    26    52
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/native) [reforged]
+
+```
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]              83.3%    83.3%   100.0%    96%   0.6   4.8s    50    100   100   100   100   100   100   100   100    60    32    34    78    94   100   100   100   100   100    92   100   100    62    30    28    78    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.9%    82.9%   100.0%    95%   0.6   5.0s    50    100   100   100   100   100    92   100   100    68    40    30    60    94   100   100   100   100   100    96   100   100    62    36    20    72    86
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         83.2%    83.2%   100.0%    97%   0.6   5.6s    50    100   100   100   100   100    86    98   100    68    40    32    76    96   100   100   100   100   100    92   100   100    62    34    30    68    82
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Reasoning-2512-Q4_K_M (llamaserver/native) [reforged]
+
+```
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]              81.4%    81.6%    99.7%    94%   0.6   3.8s    50    100   100    98   100   100   100   100   100   100    68    24    18    86   100   100    98   100   100   100   100   100    96    70     4    28    26
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.4%    83.0%    99.3%    92%   0.6   4.2s    50    100   100   100   100   100    98   100   100   100    68    16    28    86   100   100   100   100   100    96    98   100    96    64     6    24    62
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         81.8%    81.8%   100.0%    95%   0.5   4.3s    50    100   100    98   100    98    98    98   100    98    62     8    30    96   100   100    98   100   100    98    96   100    98    62     2    40    46
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/prompt) [reforged]
+
+```
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]              80.9%    84.8%    95.4%    95%   0.7   3.9s    50    100   100    90   100   100    98   100    98   100    72     6    50    76   100   100    86   100   100   100   100    92    96    64     2    42    32
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:keep-last]    81.2%    84.6%    96.0%    97%   0.7   4.1s    50    100   100    94   100   100    94   100    92    98    72    12    42    78   100   100    86   100   100    98   100    98   100    64     0    52    32
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:full]         81.4%    84.8%    96.0%    95%   0.7   4.0s    50    100   100    88   100   100    94   100    96   100    72    12    58    80   100    98    88   100   100    96   100    92   100    64     2    36    40
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/prompt) [reforged]
+
+```
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]              77.7%    78.8%    98.5%    95%   0.6   3.8s    50    100   100   100   100   100    74   100   100    78    32     2    46    90   100   100   100   100   100    66   100   100    74    28     4    48    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    78.0%    78.8%    98.9%    95%   0.6   3.8s    50    100   100   100   100   100    82   100   100    70    28     2    52    78   100   100   100   100   100    74   100   100    80    18     4    48    92
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.2%    82.0%    98.9%    98%   0.6   3.8s    50    100   100   100   100   100    74   100   100    68    40     4    72    88   100   100   100   100   100    82   100   100    78    38    10    64    92
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Reasoning-2512-Q4_K_M (llamaserver/prompt) [reforged]
+
+```
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]              79.5%    81.9%    97.0%    94%   0.7   2.8s    50    100   100    98   100   100    96   100    90    96    68     4    32    68   100    98    98   100   100    96    96    94    98    70     0    34    30
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    79.8%    82.6%    96.7%    95%   0.6   2.8s    50    100   100    96   100   100    98    98    94    98    66     8    36    64   100   100    96   100   100   100    98    96    94    76     0    34    24
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.0%    83.1%    97.5%    95%   0.7   3.0s    50    100    98    88   100   100    96   100    98    98    84    24    38    70   100   100   100   100   100    96    98   100    96    72     2    22    26
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q4_K_M (llamaserver/native) [reforged]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/N [reforged]              79.7%    79.9%    99.8%   100%   0.3   8.1s    50    100   100   100   100   100    94    98   100    84     8    30    64    92   100   100   100   100   100    88    94   100    86     2     0    48    84
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:keep-last]    78.7%    81.4%    96.7%    99%   0.5   9.3s    50    100   100   100   100   100    96    92    96    96     6    20    64    62   100   100   100   100   100    88    88   100    98     2     0    82    56
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:full]         79.2%    82.2%    96.3%    99%   0.5  10.0s    50    100   100   100   100   100    96    88    94   100     2    40    78    54   100   100   100   100   100    94    84    98    96     0     0    82    52
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q8_0 (llamaserver/native) [reforged]
+
+```
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/N [reforged]              77.8%    77.9%    99.8%   100%   0.2  10.8s    50    100   100   100   100   100    76    90   100   100     0    18    38    98   100   100   100   100   100    76    98   100    98     0     0    36    94
+gemma-4-E4B-it-Q8_0 LS/N [reforged:keep-last]    75.7%    79.3%    95.5%    99%   0.5  13.1s    50    100   100   100   100   100    76    92    98    96     4    16    82    48   100   100   100   100   100    50    86    98    94     2     0    86    40
+gemma-4-E4B-it-Q8_0 LS/N [reforged:full]         75.6%    80.8%    93.6%    98%   0.6  12.5s    50    100   100   100   100   100    92    94    92    88     0    18    82    28   100   100   100   100   100    76    92    94    98     2     0    84    26
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q8_0 (llamaserver/prompt) [reforged]
+
+```
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/P [reforged]              73.2%    73.3%    99.8%    85%   0.6  13.3s    50    100   100   100   100   100    54    98    94    80     0    28    20    94   100   100   100   100   100    40   100    98    90     0     0    12    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:keep-last]    74.1%    74.2%    99.8%    85%   0.6  13.4s    50    100   100   100   100   100    48   100    96    84     0    22    28    94   100   100   100   100   100    52   100    98    90     0     0    18    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:full]         73.7%    73.7%   100.0%    86%   0.6  12.7s    50    100   100   100   100   100    48   100    92    90     0    36    28    88   100   100   100   100   100    48    94    98    82     0     0    20    92
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q4_K_M (llamaserver/prompt) [reforged]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/P [reforged]              72.4%    72.4%    99.9%    85%   0.6   8.9s    50    100   100   100   100   100    56    96   100    86     0    14    28    84   100   100   100   100   100    34   100    96    74     0     0    22    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:keep-last]    73.2%    73.2%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    40    98    96    92     0    10    24    96   100   100   100   100   100    38   100   100    86     0     0    32    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:full]         72.9%    72.9%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    54    98    98    86     0    10    28    88   100   100   100   100   100    26    98    96    94     0     0    30    90
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q8_0 (llamaserver/prompt) [reforged]
+
+```
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/P [reforged]              72.0%    72.3%    99.6%    88%   0.4  28.6s    50    100   100   100   100   100    96   100    56    90     0     2    10    96   100   100   100   100    96    98   100    58    88     0     0    20    62
+Qwen3-8B-Q8_0 LS/P [reforged:keep-last]    72.8%    73.0%    99.7%    89%   0.4  28.0s    50    100   100   100   100   100    98    98    80    90     0     6     8    94   100   100   100   100    96    96   100    60    98     0     2    12    54
+Qwen3-8B-Q8_0 LS/P [reforged:full]         72.8%    72.9%    99.8%    88%   0.4  28.9s    50    100   100   100   100   100    98   100    70    90     0     4    20    96   100   100   100   100    92   100    96    66    92     0     0    12    56
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q4_K_M (llamaserver/prompt) [reforged]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/P [reforged]              71.1%    71.2%    99.8%    87%   0.5  18.0s    50    100   100   100   100   100    96   100    70    66     0    18     8    96   100   100   100   100    98    96   100    60    60     0     0    10    70
+Qwen3-8B-Q4_K_M LS/P [reforged:keep-last]    72.2%    72.3%    99.9%    88%   0.5  17.9s    50    100   100   100   100   100   100   100    62    66     0    30     8    90   100   100   100   100   100    98   100    74    68     0     0     8    74
+Qwen3-8B-Q4_K_M LS/P [reforged:full]         70.5%    70.8%    99.6%    88%   0.4  17.4s    50    100   100   100   100   100    88   100    58    66     0    24    10    88   100   100   100   100   100    94    98    62    66     0     0     4    76
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-14B-Q4_K_M (llamaserver/prompt) [reforged]
+
+```
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-14B-Q4_K_M LS/P [reforged]              71.4%    71.4%    99.9%    86%   0.5  25.6s    50    100   100   100   100   100    98   100    70    70     0     0    30    80   100   100   100   100   100    92   100    76    56     0     0    22    62
+Qwen3-14B-Q4_K_M LS/P [reforged:keep-last]    71.8%    71.8%   100.0%    86%   0.5  23.8s    50    100   100   100   100   100    98   100    72    58     2     4    28    72   100   100   100   100   100    94   100    76    74     0     0    32    56
+Qwen3-14B-Q4_K_M LS/P [reforged:full]         71.8%    71.9%    99.8%    87%   0.5  24.3s    50    100   100   100   100   100    96   100    72    72     2     0    30    74   100   100   100   100   100    92   100    74    68     0     0    38    48
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q8_0 (llamaserver/native) [reforged]
+
+```
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/N [reforged]              68.2%    68.5%    99.5%    95%   0.3  24.8s    50    100   100   100   100   100   100   100    56    78     6    24    28     6    98   100   100   100   100   100   100    52    84     6     2    30     2
+Qwen3-8B-Q8_0 LS/N [reforged:keep-last]    67.0%    67.2%    99.8%    92%   0.4  23.2s    50    100   100   100   100   100    94    98    48    84     2    22    10    12   100   100   100   100   100    96   100    52    80     0    18    20     6
+Qwen3-8B-Q8_0 LS/N [reforged:full]         69.3%    69.6%    99.6%    88%   0.6  24.7s    50     98   100   100   100   100    94   100    48    76     2    28    24    46   100   100   100   100   100    98   100    46    76     2     8    16    40
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-14B-Q4_K_M (llamaserver/native) [reforged]
+
+```
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-14B-Q4_K_M LS/N [reforged]              68.5%    68.5%   100.0%   100%   0.4  21.8s    50    100   100   100   100    98   100   100    56    72    14     4    58     4   100   100   100   100   100   100   100    42    72    10     0    52     0
+Qwen3-14B-Q4_K_M LS/N [reforged:keep-last]    64.0%    64.0%    99.9%    91%   0.6  20.3s    50    100   100   100   100   100    90    98    48    40    18     6    38     4   100   100   100    98    96    88   100    54    30    16     2    38     0
+Qwen3-14B-Q4_K_M LS/N [reforged:full]         68.4%    68.4%    99.9%    83%   0.9  21.9s    50    100   100   100   100    98    90    98    60    32    20    18    50    38   100   100   100   100   100    86   100    74    34     6    18    34    22
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q4_K_M (llamaserver/native) [bare]
+
+```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                    Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/N [bare]              67.5%    73.5%    91.8%   100%   0.3   7.5s    50    100   100   100   100   100    88    96     0    90     2    28    44    70   100   100   100   100   100    82   100     0    88     4     0    62     0
+gemma-4-E4B-it-Q4_K_M LS/N [bare:keep-last]    66.6%    75.6%    88.1%   100%   0.2   9.3s    50    100   100    68   100   100    86    90     0    96     0    24    78    84    98   100    66   100   100    76    88     0    96     4     0    78     0
+gemma-4-E4B-it-Q4_K_M LS/N [bare:full]         67.9%    77.5%    87.7%   100%   0.2   9.5s    50     96    98    78   100   100    94    84     0    92     2    24    80    80    98   100    78   100   100    92    92     0    94     0     0    84     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q4_K_M (llamaserver/native) [reforged]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/N [reforged]              67.3%    67.5%    99.7%    96%   0.3  15.6s    50    100   100   100   100   100   100   100    40    98     6    14    22     2   100   100   100   100   100   100   100    48    86     2     0    26     6
+Qwen3-8B-Q4_K_M LS/N [reforged:keep-last]    64.5%    64.6%    99.9%    91%   0.4  15.0s    50    100   100   100   100   100   100   100    30    82     0    28    12    10   100   100   100    98   100    96   100    22    86     2     2     6     4
+Qwen3-8B-Q4_K_M LS/N [reforged:full]         65.8%    66.0%    99.7%    84%   0.7  17.2s    50    100   100   100   100   100    94   100    34    66     0    18    10    38   100   100   100    96   100    86   100    34    74     0    10    12    40
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q8_0 (llamaserver/native) [bare]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/N [bare]              66.2%    72.2%    91.7%   100%   0.2  10.1s    50     98   100   100   100   100    82    94     0    94     0    30    48    86   100   100   100   100   100    74    96     0    94     0     0    26     0
+gemma-4-E4B-it-Q8_0 LS/N [bare:keep-last]    67.0%    75.4%    88.8%   100%   0.2  14.0s    50     96   100    72    98   100    74    96     0    96     0    28    78    94    98   100    74   100    98    74    92     0    90     0     0    84     0
+gemma-4-E4B-it-Q8_0 LS/N [bare:full]         65.5%    75.3%    87.0%   100%   0.3  14.4s    50     96    96    76   100   100    88    86     0    80     0    14    84    80    92   100    74   100   100    76    88     0    88     0     0    86     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## granite-4.1-8b-Q4_K_M (llamaserver/native) [reforged]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite-4.1-8b-Q4_K_M LS/N [reforged]              65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:keep-last]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:full]         65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q8_0 (llamaserver/prompt) [bare]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/P [bare]              63.5%    69.6%    91.3%    97%   0.2  27.0s    50    100   100    96    98   100    98    96     0    90     0     8    18    92   100   100    94   100   100    64   100     0    84     0     2    12     0
+Qwen3-8B-Q8_0 LS/P [bare:keep-last]    63.4%    69.7%    90.9%    96%   0.2  28.5s    50    100   100    94   100   100    98    98     0    88     0     4    16    92   100   100    88   100    94    72    98     0    96     0     0    10     0
+Qwen3-8B-Q8_0 LS/P [bare:full]         63.9%    69.9%    91.5%    96%   0.2  27.3s    50    100   100    94   100   100    96    98     0    88     0     2    14    90   100   100    98   100   100    68    98     0    96     0     0    20     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/prompt) [bare]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                    Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare]              63.6%    78.5%    81.0%   100%   0.3   3.3s    50    100   100   100   100    98    82    90     0    60    20     4    46    72   100   100   100   100    96    72    96     0    56    22     0    40     0
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare:keep-last]    63.4%    76.7%    82.7%   100%   0.3   3.2s    50    100   100   100   100    98    68    98     0    64    20     4    46    78   100   100    96   100   100    68    92     0    58    14     0    44     0
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare:full]         63.5%    77.3%    82.1%   100%   0.3   3.2s    50    100   100   100   100   100    76    96     0    64    12     2    40    70   100    98   100   100   100    70    92     0    60    18     0    52     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q4_K_M (llamaserver/prompt) [bare]
+
+```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                    Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/P [bare]              62.5%    67.8%    92.2%    91%   0.4   9.0s    50    100   100   100   100   100    56   100     0    88     0    22    30   100   100   100   100   100   100    10   100     0    88     0     0    30     0
+gemma-4-E4B-it-Q4_K_M LS/P [bare:keep-last]    59.9%    65.0%    92.2%    92%   0.4   8.7s    50    100   100   100   100   100    44    98     0    90     0     4    22    92   100   100   100   100   100    12    92     0    92     0     0    12     0
+gemma-4-E4B-it-Q4_K_M LS/P [bare:full]         61.5%    66.7%    92.2%    91%   0.4   9.1s    50    100   100   100   100   100    46    94     0    86     0    14    28    88   100   100   100   100   100    22    96     0    94     0     0    30     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## gemma-4-E4B-it-Q8_0 (llamaserver/prompt) [bare]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/P [bare]              61.2%    66.3%    92.3%    94%   0.3  12.1s    50    100   100   100   100   100    64    96     0    84     0    22    22    94   100   100   100   100   100     0   100     0    80     0     0    30     0
+gemma-4-E4B-it-Q8_0 LS/P [bare:keep-last]    61.2%    66.4%    92.2%    95%   0.3  12.5s    50    100   100   100   100   100    68   100     0    84     0    22    20    90   100   100   100   100   100     2   100     0    82     0     0    24     0
+gemma-4-E4B-it-Q8_0 LS/P [bare:full]         60.5%    65.7%    92.1%    94%   0.3  12.6s    50    100   100   100   100   100    60    98     0    84     0    14    24    90   100   100   100   100   100     2    96     0    88     0     0    18     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/prompt) [bare]
+
+```
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare]              58.6%    80.0%    73.3%   100%   0.4   3.3s    50     52    88    32   100    98   100   100     0    86    58     4    18    64    36    92    48   100    96    92   100     0    90    60     2     6     2
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare:keep-last]    59.2%    80.5%    73.5%   100%   0.4   3.1s    50     44    88    36   100    96    96   100     0    82    66     4    10    82    34    92    38   100    98    96   100     0    92    66     0    16     4
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare:full]         58.5%    79.5%    73.5%   100%   0.3   3.5s    50     26    98    40   100    98    96   100     0    92    54     6    14    86    44    92    38   100    96    92   100     0    92    44     0     8     4
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q4_K_M (llamaserver/prompt) [bare]
+
+```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/P [bare]              57.4%    66.6%    86.2%    97%   0.2  17.2s    50     98   100    56    96   100    90   100     0    62     0    22     6    94   100   100    48    94   100    56   100     0    64     0     0     4     2
+Qwen3-8B-Q4_K_M LS/P [bare:keep-last]    59.1%    68.0%    86.9%    98%   0.2  16.6s    50     98   100    56    98   100    94   100     0    58     0    22     8    94   100   100    68    94    98    72   100     0    72     0     0     2     2
+Qwen3-8B-Q4_K_M LS/P [bare:full]         57.8%    66.9%    86.5%    97%   0.2  17.5s    50    100   100    66    94   100    96   100     0    54     0    24     8    96   100   100    58    96    98    68   100     0    44     0     0     2     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Reasoning-2512-Q4_K_M (llamaserver/prompt) [bare]
+
+```
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare]              55.2%    79.8%    69.2%   100%   0.4   2.3s    50     58    70    14   100    92    96   100     0    72    48     4    16    66    58    84    10   100    92    96    98     0    88    60     0    12     2
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare:keep-last]    54.3%    78.8%    68.9%   100%   0.3   2.4s    50     50    78     4   100    92    98   100     0    68    50    10    12    64    64    72    10   100    92    92    96     0    84    62     0    12     2
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare:full]         54.6%    78.7%    69.4%   100%   0.4   2.4s    50     58    74     4   100    86    94    98     0    76    54     2    12    62    66    78     8   100    94    96    98     0    86    54     0    18     2
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-14B-Q4_K_M (llamaserver/prompt) [bare]
+
+```
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                               Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-14B-Q4_K_M LS/P [bare]              54.1%    63.2%    85.6%    95%   0.2  22.3s    50    100   100    14   100   100    92   100     0    64     0     0    34    46   100   100    20    88    98    50   100     0    68     0     0    16    16
+Qwen3-14B-Q4_K_M LS/P [bare:keep-last]    53.8%    63.2%    85.2%    95%   0.2  23.1s    50    100   100    20   100   100    94   100     0    60     0     2    30    56   100   100    18    88    98    38   100     0    62     0     0    24    10
+Qwen3-14B-Q4_K_M LS/P [bare:full]         53.9%    62.9%    85.8%    94%   0.2  24.4s    50    100   100    22   100   100    90   100     0    66     0     0    30    56   100   100    14    76   100    30   100     0    70     0     0    36    12
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## granite-4.1-8b-Q4_K_M (llamaserver/native) [bare]
+
+```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                    Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite-4.1-8b-Q4_K_M LS/N [bare]              53.8%    70.0%    76.9%    96%   0.2   1.9s    50      0   100   100   100   100   100   100     0   100     0     0     0     0     0   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [bare:keep-last]    53.8%    70.0%    76.9%    96%   0.2   1.9s    50      0   100   100   100   100   100   100     0   100     0     0     0     0     0   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [bare:full]         53.8%    70.0%    76.9%    96%   0.2   2.0s    50      0   100   100   100   100   100   100     0   100     0     0     0     0     0   100   100   100   100   100   100     0   100     0     0     0     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q4_K_M (llamaserver/native) [bare]
+
+```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/N [bare]              53.2%    64.9%    82.1%   100%   0.1  15.0s    50     92   100     4    86    92   100   100     0    62     2    32    26    12    86   100     6    96    90   100    98     0    76     2     0    22     0
+Qwen3-8B-Q4_K_M LS/N [bare:keep-last]    46.4%    64.3%    72.2%   100%   0.1  15.0s    50     96    78     2    86    70    94    76     0    74     4    34     4     8    80    76     4    76    76    96    86     0    76     0     4     6     0
+Qwen3-8B-Q4_K_M LS/N [bare:full]         45.2%    63.7%    71.0%   100%   0.1  13.6s    50     92    80     2    86    76   100    80     0    74     0    14    12     2    88    68     4    74    82    94    70     0    60     0     4    14     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-8B-Q8_0 (llamaserver/native) [bare]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/N [bare]              50.4%    63.4%    79.5%   100%   0.1  23.1s    50     88   100     2    62    98   100    98     0    70     2    22    40     2    92   100     4    46    88    94    98     0    70     2     4    28     0
+Qwen3-8B-Q8_0 LS/N [bare:keep-last]    49.7%    65.4%    76.0%   100%   0.1  21.6s    50     86    78     2    88    88   100    80     0    64     2    18    18     8    90    88     2    84    94    96    86     0    92     0    12    16     0
+Qwen3-8B-Q8_0 LS/N [bare:full]         46.8%    63.0%    74.4%   100%   0.1  20.9s    50     84    70     0    84    96    96    82     0    62     0    12    14     8    88    70     2    98    88    84    86     0    74     0     4    16     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Qwen3-14B-Q4_K_M (llamaserver/native) [bare]
+
+```
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                               Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-14B-Q4_K_M LS/N [bare]              48.1%    60.0%    80.2%   100%   0.1  21.2s    50    100    96     0    38    66   100   100     0    62     8     0    48     2   100    98     0    42    56    94   100     0    72    12     0    56     0
+Qwen3-14B-Q4_K_M LS/N [bare:keep-last]    30.8%    52.1%    59.2%   100%   0.0  18.3s    50    100    28     6    24    60    72    34     0    12    16     4    34     0   100    12     8    46    52    60    46     0    24    28     6    30     0
+Qwen3-14B-Q4_K_M LS/N [bare:full]         27.5%    48.5%    56.8%   100%   0.0  16.5s    50     96    16     4    24    48    62    36     0    22     4     6    28     0   100    18     6    52    62    60    18     0    14     4     4    32     0
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/native) [bare]
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                    Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare]              40.2%    81.0%    49.7%   100%   0.4   4.5s    50      0   100    50    76    94    70     0     0    14     6    28    32    86     0   100    52    92    94    78     0     0    10     4    16    24    20
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare:keep-last]    41.5%    77.2%    53.7%   100%   0.3   5.5s    50      0   100    52    64    94    68     2     0    12     4    34    40    82     0   100    64    90   100    82     2     0    24     4    16    30    14
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare:full]         45.6%    80.2%    56.8%   100%   0.3   5.8s    50      0   100    72    78    98    72     6     0    26    18    34    36    94     0   100    44    96    92    84     2     0    16    28    26    40    24
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/native) [bare]
+
+```
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare]              43.5%    80.3%    54.2%   100%   0.3   3.8s    50      2   100    98   100   100    86     0     0     2    14    12    14    72     2   100    98   100   100    86     0     0     6    16     0    22     0
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare:keep-last]    44.2%    79.3%    55.7%   100%   0.3   4.8s    50     10   100    92   100   100    94     0     0     2    16    26    30    52     4   100    94   100   100    78     0     0     4    18     4    20     4
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare:full]         44.3%    77.9%    56.8%   100%   0.3   5.7s    50      4   100    94   100   100    84     0     0     2    10    20    24    86     4   100    94   100   100    84     0     0     2    10     4    28     2
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Reasoning-2512-Q4_K_M (llamaserver/native) [bare]
+
+```
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare]              40.2%    77.0%    52.2%    98%   0.3   2.8s    50      6   100    84   100   100    68     0     0     4     6    18     8    62    10   100    86   100   100    70     0     0     2    12     0     8     0
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare:keep-last]    41.2%    77.3%    53.3%   100%   0.2   3.4s    50      6   100    82   100   100    78     0     0     4    10    20    20    58    10   100    90   100   100    70     0     0     4     6     4    10     0
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare:full]         41.8%    72.7%    57.5%   100%   0.3   3.7s    50      6   100    84   100   100    70     2     0     2    12     8    12    90    14   100    82   100   100    66     2     0     0    18     0    16     2
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+Scr=score(correct/total), Acc=accuracy(correct/total, excl validate errors), Cmp=completeness(completed/total), Eff=efficiency(ideal/actual calls), Wst=avg wasted calls, Spd=avg time(excl compaction)
+rel=relevance_detection, arg=argument_fidelity, tsl=tool_selection, b2s=basic_2step, s3s=sequential_3step, crt=conditional_routing, srn=sequential_reasoning, err=error_recovery, dgr=data_gap_recovery, dge=data_gap_recovery_extended, art=argument_transformation, grs=grounded_synthesis, iar=inconsistent_api_recovery, rel_s=relevance_detection_stateful, arg_s=argument_fidelity_stateful, tsl_s=tool_selection_stateful, b2s_s=basic_2step_stateful, s3s_s=sequential_3step_stateful, crt_s=conditional_routing_stateful, srn_s=sequential_reasoning_stateful, err_s=error_recovery_stateful, dgr_s=data_gap_recovery_stateful, dge_s=data_gap_recovery_extended_stateful, art_s=argument_transformation_stateful, grs_s=grounded_synthesis_stateful, iar_s=inconsistent_api_recovery_stateful
+Ablation: full=all guardrails, no_rescue=no rescue loop, no_nudge=no rescue/retry nudge, no_steps=no step enforcement, no_recovery=no error recovery, no_compact=no compaction, bare=all guardrails off
+Replay: ':keep-last'/':full' tags = reasoning_replay policy (how much captured reasoning is re-sent to the backend each turn); untagged = none (default). Rows predating the knob ran unbounded replay and count as full.
+
+Eval generations (older runs carried forward, superscript-tagged):
+  ¹ gen 1 — v0.6.0 suite — incl. Anthropic ablation (commit 2b05dc4, 2026-05-08)
+  ² gen 2 — v0.7.0 lineup refresh (8–14B) + 32GB tier debut (v0.7.4) (commit 655e1f6, 2026-05-22)
+
+*Generated 2026-06-11 20:28*
diff --git a/docs/results/raw/reforged-vs-bare.md b/docs/results/raw/reforged-vs-bare.md
index cd40f62..c557e14 100644
--- a/docs/results/raw/reforged-vs-bare.md
+++ b/docs/results/raw/reforged-vs-bare.md
@@ -1,105 +1,121 @@
 # Forge Eval — Reforged vs Bare
 
-## claude-opus-4-6 (anthropic/native)
+## claude-sonnet-4-6 (anthropic/native)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-claude-opus-4-6 AN/N [reforged]¹    99.2%    99.8%    99.4%   100%   0.0  15.6s    50    100   100   100   100   100    98   100   100   100   100    98    94    98   100   100   100   100   100   100   100    96   100   100    98   100    98
-claude-opus-4-6 AN/N [bare+any]¹    87.1%    95.4%    91.3%   100%   0.0  12.9s    50    100   100   100   100    98   100   100     0   100   100    80   100   100   100   100   100   100   100   100   100     0   100   100    86   100     0
-claude-opus-4-6 AN/N [bare]¹        87.9%    95.8%    91.8%   100%   0.0  16.5s    50    100   100   100    98   100   100   100     0   100   100   100   100   100   100   100   100   100    98   100   100     0    98   100    96    96     0
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                          Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+claude-sonnet-4-6 AN/N [reforged]   100.0%   100.0%   100.0%   100%   0.0  18.2s    50    100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100
+claude-sonnet-4-6 AN/N [bare+any]    81.2%    87.9%    92.3%   100%   0.0   9.8s    50    100   100   100   100   100   100   100     0   100   100     6   100   100   100   100   100   100   100   100   100     0   100   100     4   100     0
+claude-sonnet-4-6 AN/N [bare]        88.4%    95.8%    92.3%   100%   0.0  18.0s    50    100   100   100   100   100   100   100     0   100   100    98   100   100   100   100   100   100   100   100   100     0   100   100   100   100     0
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## claude-sonnet-4-6 (anthropic/native)
+## claude-opus-4-8 (anthropic/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-claude-sonnet-4-6 AN/N [reforged]¹    98.4%    98.5%    99.9%   100%   0.1  13.1s    50    100   100   100   100   100   100   100   100   100    98    74    98   100   100   100   100   100   100   100   100   100   100   100    88   100   100
-claude-sonnet-4-6 AN/N [bare]¹        85.1%    95.0%    89.5%   100%   0.0  14.3s    50    100   100    68    98   100   100   100     0   100    98    86    98   100   100   100    66   100   100   100   100     0   100   100    98   100     0
-claude-sonnet-4-6 AN/N [bare+any]¹    81.5%    88.2%    92.3%   100%   0.0  11.6s    50    100   100   100   100   100   100   100     0   100   100    12   100   100   100   100   100   100   100   100   100     0   100   100    16    90     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+claude-opus-4-8 AN/N [reforged]   100.0%   100.0%   100.0%   100%   0.0  13.3s    50    100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100
+claude-opus-4-8 AN/N [bare]        88.0%    95.8%    91.8%   100%   0.0  13.7s    50     90   100   100   100   100   100   100     0   100   100   100   100   100    98   100   100   100   100   100   100     0   100   100   100   100     0
+claude-opus-4-8 AN/N [bare+any]    83.6%    90.7%    92.2%   100%   0.0   9.3s    50    100   100   100   100   100   100   100     0   100   100    30   100   100   100   100   100   100   100   100   100     0   100   100    44   100     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## claude-opus-4-6 (anthropic/native)
+
+```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+claude-opus-4-6 AN/N [reforged:full]¹    99.2%    99.8%    99.4%   100%   0.0  15.6s    50    100   100   100   100   100    98   100   100   100   100    98    94    98   100   100   100   100   100   100   100    96   100   100    98   100    98
+claude-opus-4-6 AN/N [bare+any:full]¹    87.1%    95.4%    91.3%   100%   0.0  12.9s    50    100   100   100   100    98   100   100     0   100   100    80   100   100   100   100   100   100   100   100   100     0   100   100    86   100     0
+claude-opus-4-6 AN/N [bare:full]¹        87.9%    95.8%    91.8%   100%   0.0  16.5s    50    100   100   100    98   100   100   100     0   100   100   100   100   100   100   100   100   100    98   100   100     0    98   100    96    96     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.6-35B-A3B-UD-Q4_K_M (llamaserver/native)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged]    94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [bare]        52.8%    88.4%    59.8%   100%   0.1  11.8s    50     14    72     2    92    68    92    42     0    98    58    66    52    90    10    80     4    86    56    90    40     0    90    70    56    46     0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged:full]²    94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [bare:full]²        52.8%    88.4%    59.8%   100%   0.1  11.8s    50     14    72     2    92    68    92    42     0    98    58    66    52    90    10    80     4    86    56    90    40     0    90    70    56    46     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## claude-haiku-4-5-20251001 (anthropic/native)
 
 ```
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-claude-haiku-4-5-20251001 AN/N [reforged]¹    94.5%    94.9%    99.6%   100%   0.3   8.5s    50    100   100   100   100   100   100   100   100   100    80    80    98   100   100   100   100   100   100   100    94   100   100    76    36    94   100
-claude-haiku-4-5-20251001 AN/N [bare]¹        46.5%    86.2%    54.0%   100%   0.0   8.2s    50      0    96   100     0   100   100     0     0     4     4    74   100   100     0    96    96     0   100   100     0     0     0     6    36    98     0
-claude-haiku-4-5-20251001 AN/N [bare+any]¹    74.0%    80.2%    92.3%   100%   0.0   5.5s    50    100   100   100   100   100   100   100     0   100    92     0    22   100   100   100   100   100   100   100   100     0   100    82     2    26     0
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+claude-haiku-4-5-20251001 AN/N [reforged]    94.2%    94.2%    99.9%   100%   0.3   6.6s    50    100   100   100   100   100   100    98   100   100    74    74    98   100   100   100   100   100   100   100   100   100   100    72    38    94   100
+claude-haiku-4-5-20251001 AN/N [bare]        47.2%    87.1%    54.2%   100%   0.0   7.4s    50      0   100   100     2   100   100     0     0     0     2    82   100   100     0    98   100     0   100   100     0     0     0     2    42   100     0
+claude-haiku-4-5-20251001 AN/N [bare+any]    74.0%    80.2%    92.3%   100%   0.0   5.2s    50    100   100   100   100   100   100   100     0   100    86     0    32   100   100   100   100   100   100   100   100     0   100    84     0    22     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.5-27B-Q4_K_M (llamaserver/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-27B-Q4_K_M LS/N [reforged]    93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
-Qwen3.5-27B-Q4_K_M LS/N [bare]        15.8%   100.0%    15.8%   100%   0.0  11.0s    50     88     0    12    30     2    32     0     0     2     0     0    16     0    96     0    24    56     4    40     0     0     4     0     0     6     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-27B-Q4_K_M LS/N [reforged:full]²    93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
+Qwen3.5-27B-Q4_K_M LS/N [bare:full]²        15.8%   100.0%    15.8%   100%   0.0  11.0s    50     88     0    12    30     2    32     0     0     2     0     0    16     0    96     0    24    56     4    40     0     0     4     0     0     6     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.6-27B-Q4_K_M (llamaserver/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-27B-Q4_K_M LS/N [reforged]    92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
-Qwen3.6-27B-Q4_K_M LS/N [bare]        47.0%    92.4%    50.8%   100%   0.0  26.3s    50     52    88    82    68    28    42    48     0    72    26    22    62    22    62    88    80    66    40    42    54     0    80    24    12    62     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-27B-Q4_K_M LS/N [reforged:full]²    92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
+Qwen3.6-27B-Q4_K_M LS/N [bare:full]²        47.0%    92.4%    50.8%   100%   0.0  26.3s    50     52    88    82    68    28    42    48     0    72    26    22    62    22    62    88    80    66    40    42    54     0    80    24    12    62     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.5-35B-A3B-Q4_K_M (llamaserver/native)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                               Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged]    92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
-Qwen3.5-35B-A3B-Q4_K_M LS/N [bare]        12.2%    97.5%    12.5%   100%   0.0   3.9s    50     76     2     0    28     8    26     6     0     6     0     0     2     0    70     0     0    28     8    32    10     0     4     2     0     8     0
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged:full]²    92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
+Qwen3.5-35B-A3B-Q4_K_M LS/N [bare:full]²        12.2%    97.5%    12.5%   100%   0.0   3.9s    50     76     2     0    28     8    26     6     0     6     0     0     2     0    70     0     0    28     8    32    10     0     4     2     0     8     0
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.5-27B-Q4_K_M (llamaserver/prompt)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-27B-Q4_K_M LS/P [reforged]    86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
-Qwen3.5-27B-Q4_K_M LS/P [bare]        74.3%    81.0%    91.8%   100%   0.0  24.0s    50    100   100   100   100   100   100   100     0    92    38    14    84   100   100   100   100   100   100    68   100     0    98    44     6    88     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-27B-Q4_K_M LS/P [reforged:full]²    86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
+Qwen3.5-27B-Q4_K_M LS/P [bare:full]²        74.3%    81.0%    91.8%   100%   0.0  24.0s    50    100   100   100   100   100   100   100     0    92    38    14    84   100   100   100   100   100   100    68   100     0    98    44     6    88     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/native)
+## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/native)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]    84.5%    84.5%   100.0%    97%   0.6   5.4s    50    100   100   100   100   100    88   100   100    70    44    48    76    94   100   100   100   100   100    96    98   100    76    38    26    62    82
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare]        43.3%    78.5%    55.2%   100%   0.3   5.6s    50      0   100    54    74    94    74     8     0    22    22    28    40    84     0   100    52    90    94    76     6     0    24    24    20    26    14
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]              84.5%    84.8%    99.7%    96%   0.6   5.3s    50    100   100    96   100   100    98   100   100   100    76    18    44    92   100   100   100   100   100   100    98   100    98    80     2    42    54
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:keep-last]    84.8%    85.6%    99.1%    94%   0.6   5.9s    50    100   100   100   100   100   100   100   100    98    70    24    42    86   100   100   100   100   100   100   100   100   100    82     6    36    62
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:full]         83.1%    83.1%    99.9%    96%   0.5   6.0s    50    100   100   100   100   100   100    98   100   100    66    20    36    88   100   100   100   100   100   100   100   100    98    74     2    26    52
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare]                  43.5%    80.3%    54.2%   100%   0.3   3.8s    50      2   100    98   100   100    86     0     0     2    14    12    14    72     2   100    98   100   100    86     0     0     6    16     0    22     0
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare:keep-last]        44.2%    79.3%    55.7%   100%   0.3   4.8s    50     10   100    92   100   100    94     0     0     2    16    26    30    52     4   100    94   100   100    78     0     0     4    18     4    20     4
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare:full]             44.3%    77.9%    56.8%   100%   0.3   5.7s    50      4   100    94   100   100    84     0     0     2    10    20    24    86     4   100    94   100   100    84     0     0     2    10     4    28     2
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Instruct-2512-Q8_0 (llamaserver/prompt)
@@ -108,97 +124,128 @@ Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare]        43.3%    78.5%    55.2%
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                          Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]    84.4%    91.1%    92.6%    92%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100   100     0    98   100
-Ministral-3-8B-Instruct-2512-Q8_0 LS/P [bare]        50.2%    82.2%    61.0%   100%   0.2   2.6s    50    100   100     0   100   100   100     0     0    90     0     4     0    70   100   100     0   100   100   100     6     0    88     0     0     2    44
+Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]    84.2%    90.9%    92.7%    91%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100    96     0    98   100
+Ministral-3-8B-Instruct-2512-Q8_0 LS/P [bare]        50.2%    83.0%    60.5%   100%   0.2   2.7s    50    100   100     0   100    98   100     0     0    82     0     8     0    78   100   100     0   100    94   100     0     0    90     0     0     0    54
 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/native)
+## Qwen3.6-27B-Q4_K_M (llamaserver/prompt)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]    84.2%    84.2%   100.0%    95%   0.5   6.0s    50    100   100   100   100   100   100   100   100    98    74    26    54    88   100   100   100   100   100   100   100   100   100    68     2    26    52
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [bare]        43.6%    76.5%    57.0%   100%   0.2   5.3s    50      4   100    94   100   100    84     0     0     4    18    12    24    76     2   100    96   100   100    82     0     0     0    10     0    28     0
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-27B-Q4_K_M LS/P [reforged:full]²    83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
+Qwen3.6-27B-Q4_K_M LS/P [bare:full]²        69.0%    75.5%    91.4%   100%   0.2  50.5s    50    100   100   100   100   100   100   100     0   100     0    54    46    94   100    98   100   100    92     4   100     0    98     4    70    34     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Qwen3.6-27B-Q4_K_M (llamaserver/prompt)
+## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-27B-Q4_K_M LS/P [reforged]    83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
-Qwen3.6-27B-Q4_K_M LS/P [bare]        69.0%    75.5%    91.4%   100%   0.2  50.5s    50    100   100   100   100   100   100   100     0   100     0    54    46    94   100    98   100   100    92     4   100     0    98     4    70    34     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]              83.3%    83.3%   100.0%    96%   0.6   4.8s    50    100   100   100   100   100   100   100   100    60    32    34    78    94   100   100   100   100   100    92   100   100    62    30    28    78    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.9%    82.9%   100.0%    95%   0.6   5.0s    50    100   100   100   100   100    92   100   100    68    40    30    60    94   100   100   100   100   100    96   100   100    62    36    20    72    86
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         83.2%    83.2%   100.0%    97%   0.6   5.6s    50    100   100   100   100   100    86    98   100    68    40    32    76    96   100   100   100   100   100    92   100   100    62    34    30    68    82
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare]                  40.2%    81.0%    49.7%   100%   0.4   4.5s    50      0   100    50    76    94    70     0     0    14     6    28    32    86     0   100    52    92    94    78     0     0    10     4    16    24    20
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare:keep-last]        41.5%    77.2%    53.7%   100%   0.3   5.5s    50      0   100    52    64    94    68     2     0    12     4    34    40    82     0   100    64    90   100    82     2     0    24     4    16    30    14
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [bare:full]             45.6%    80.2%    56.8%   100%   0.3   5.8s    50      0   100    72    78    98    72     6     0    26    18    34    36    94     0   100    44    96    92    84     2     0    16    28    26    40    24
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.5-35B-A3B-Q4_K_M (llamaserver/prompt)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                               Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged]    82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
-Qwen3.5-35B-A3B-Q4_K_M LS/P [bare]        59.7%    77.0%    77.5%   100%   0.1   9.8s    50     30   100     2    98    98    96    98     0    92    60    20    44    96    48   100     4    92   100    90    96     0    88    58    12    28     2
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged:full]²    82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
+Qwen3.5-35B-A3B-Q4_K_M LS/P [bare:full]²        59.7%    77.0%    77.5%   100%   0.1   9.8s    50     30   100     2    98    98    96    98     0    92    60    20    44    96    48   100     4    92   100    90    96     0    88    58    12    28     2
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Reasoning-2512-Q4_K_M (llamaserver/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]    82.8%    82.8%    99.9%    95%   0.5   4.1s    50    100   100   100   100   100   100   100   100    98    66    24    34    92   100   100   100   100   100    96   100   100   100    70     0    30    42
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare]        42.5%    74.8%    56.8%   100%   0.3   3.6s    50     12   100    90   100   100    72     2     0     2    12    12    14    84    20   100    78   100   100    64     2     0     0    10     2    24     4
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]              81.4%    81.6%    99.7%    94%   0.6   3.8s    50    100   100    98   100   100   100   100   100   100    68    24    18    86   100   100    98   100   100   100   100   100    96    70     4    28    26
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.4%    83.0%    99.3%    92%   0.6   4.2s    50    100   100   100   100   100    98   100   100   100    68    16    28    86   100   100   100   100   100    96    98   100    96    64     6    24    62
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         81.8%    81.8%   100.0%    95%   0.5   4.3s    50    100   100    98   100    98    98    98   100    98    62     8    30    96   100   100    98   100   100    98    96   100    98    62     2    40    46
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare]                  40.2%    77.0%    52.2%    98%   0.3   2.8s    50      6   100    84   100   100    68     0     0     4     6    18     8    62    10   100    86   100   100    70     0     0     2    12     0     8     0
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare:keep-last]        41.2%    77.3%    53.3%   100%   0.2   3.4s    50      6   100    82   100   100    78     0     0     4    10    20    20    58    10   100    90   100   100    70     0     0     4     6     4    10     0
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [bare:full]             41.8%    72.7%    57.5%   100%   0.3   3.7s    50      6   100    84   100   100    70     2     0     2    12     8    12    90    14   100    82   100   100    66     2     0     0    18     0    16     2
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3.6-35B-A3B-UD-Q4_K_M (llamaserver/prompt)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged]    82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [bare]        65.5%    75.6%    86.6%   100%   0.1  23.0s    50     96   100    32   100   100    94    96     0    94    16    52    46    98    88   100    28   100   100    60    92     0   100    10    50    50     0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged:full]²    82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [bare:full]²        65.5%    75.6%    86.6%   100%   0.1  23.0s    50     96   100    32   100   100    94    96     0    94    16    52    46    98    88   100    28   100   100    60    92     0   100    10    50    50     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Ministral-3-8B-Instruct-2512-Q8_0 (llamaserver/native)
+## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/prompt)
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                          Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]    81.4%    81.4%   100.0%   100%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    38     4     0   100   100   100   100   100   100   100   100   100   100    74     0     0   100
-Ministral-3-8B-Instruct-2512-Q8_0 LS/N [bare]        32.5%    62.0%    52.5%   100%   0.0   4.4s    50      0   100     0     0   100   100     0     0    66     0     4     0   100     0   100     0     0   100   100     0     0    74     2     0     0     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]              80.9%    84.8%    95.4%    95%   0.7   3.9s    50    100   100    90   100   100    98   100    98   100    72     6    50    76   100   100    86   100   100   100   100    92    96    64     2    42    32
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:keep-last]    81.2%    84.6%    96.0%    97%   0.7   4.1s    50    100   100    94   100   100    94   100    92    98    72    12    42    78   100   100    86   100   100    98   100    98   100    64     0    52    32
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:full]         81.4%    84.8%    96.0%    95%   0.7   4.0s    50    100   100    88   100   100    94   100    96   100    72    12    58    80   100    98    88   100   100    96   100    92   100    64     2    36    40
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare]                  58.6%    80.0%    73.3%   100%   0.4   3.3s    50     52    88    32   100    98   100   100     0    86    58     4    18    64    36    92    48   100    96    92   100     0    90    60     2     6     2
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare:keep-last]        59.2%    80.5%    73.5%   100%   0.4   3.1s    50     44    88    36   100    96    96   100     0    82    66     4    10    82    34    92    38   100    98    96   100     0    92    66     0    16     4
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare:full]             58.5%    79.5%    73.5%   100%   0.3   3.5s    50     26    98    40   100    98    96   100     0    92    54     6    14    86    44    92    38   100    96    92   100     0    92    44     0     8     4
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Ministral-3-8B-Reasoning-2512-Q8_0 (llamaserver/prompt)
+## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/prompt)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]    81.3%    85.0%    95.7%    96%   0.7   3.9s    50    100   100    96   100   100    98   100    92    98    56    14    46    70   100   100    88   100   100    98   100    94   100    82     0    48    34
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [bare]        59.7%    80.7%    74.0%   100%   0.4   3.2s    50     48    92    38   100    96    98   100     0    94    54    10    16    78    54    92    32   100    98    98   100     0    90    48     0    14     2
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]              77.7%    78.8%    98.5%    95%   0.6   3.8s    50    100   100   100   100   100    74   100   100    78    32     2    46    90   100   100   100   100   100    66   100   100    74    28     4    48    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    78.0%    78.8%    98.9%    95%   0.6   3.8s    50    100   100   100   100   100    82   100   100    70    28     2    52    78   100   100   100   100   100    74   100   100    80    18     4    48    92
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.2%    82.0%    98.9%    98%   0.6   3.8s    50    100   100   100   100   100    74   100   100    68    40     4    72    88   100   100   100   100   100    82   100   100    78    38    10    64    92
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare]                  63.6%    78.5%    81.0%   100%   0.3   3.3s    50    100   100   100   100    98    82    90     0    60    20     4    46    72   100   100   100   100    96    72    96     0    56    22     0    40     0
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare:keep-last]        63.4%    76.7%    82.7%   100%   0.3   3.2s    50    100   100   100   100    98    68    98     0    64    20     4    46    78   100   100    96   100   100    68    92     0    58    14     0    44     0
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare:full]             63.5%    77.3%    82.1%   100%   0.3   3.2s    50    100   100   100   100   100    76    96     0    64    12     2    40    70   100    98   100   100   100    70    92     0    60    18     0    52     0
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Reasoning-2512-Q4_K_M (llamaserver/prompt)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]    80.5%    83.0%    97.0%    96%   0.7   2.7s    50    100    98    98   100   100    98    98   100    96    74    10    36    70   100    98    96   100   100    98   100    98    94    70     2    38    20
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare]        54.9%    79.3%    69.2%   100%   0.3   2.6s    50     56    78     8   100    96    98   100     0    68    60    10    18    64    60    82     8   100    96    92    96     0    72    56     0    10     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]              79.5%    81.9%    97.0%    94%   0.7   2.8s    50    100   100    98   100   100    96   100    90    96    68     4    32    68   100    98    98   100   100    96    96    94    98    70     0    34    30
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    79.8%    82.6%    96.7%    95%   0.6   2.8s    50    100   100    96   100   100    98    98    94    98    66     8    36    64   100   100    96   100   100   100    98    96    94    76     0    34    24
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.0%    83.1%    97.5%    95%   0.7   3.0s    50    100    98    88   100   100    96   100    98    98    84    24    38    70   100   100   100   100   100    96    98   100    96    72     2    22    26
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare]                  55.2%    79.8%    69.2%   100%   0.4   2.3s    50     58    70    14   100    92    96   100     0    72    48     4    16    66    58    84    10   100    92    96    98     0    88    60     0    12     2
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare:keep-last]        54.3%    78.8%    68.9%   100%   0.3   2.4s    50     50    78     4   100    92    98   100     0    68    50    10    12    64    64    72    10   100    92    92    96     0    84    62     0    12     2
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare:full]             54.6%    78.7%    69.4%   100%   0.4   2.4s    50     58    74     4   100    86    94    98     0    76    54     2    12    62    66    78     8   100    94    96    98     0    86    54     0    18     2
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## Ministral-3-8B-Instruct-2512-Q8_0 (llamaserver/native)
+
+```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                          Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]    81.0%    81.0%   100.0%   100%   0.3   4.1s    50    100   100   100   100   100   100   100   100   100    30     0     4   100   100   100   100   100   100   100   100   100   100    68     0     4   100
+Ministral-3-8B-Instruct-2512-Q8_0 LS/N [bare]        33.0%    62.2%    53.1%   100%   0.0   4.4s    50      0   100     0     0   100   100     0     0    68     0     2     2   100     0   100     0     0   100   100     0     0    78     2     4     2     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-14B-Instruct-2512-Q4_K_M (llamaserver/prompt)
@@ -207,31 +254,35 @@ Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [bare]        54.9%    79.3%    69.2%
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]    80.2%    80.2%   100.0%   100%   0.0   2.9s    50    100   100   100   100   100   100   100   100   100     0     0    36   100   100   100   100   100   100   100   100   100   100     0     0    50   100
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [bare]        66.5%    80.0%    83.1%   100%   0.0   2.4s    50    100   100   100   100   100   100   100     0   100     0     0     0   100   100   100   100   100   100   100   100     0    96     0     0     0    32
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]    80.6%    80.6%   100.0%   100%   0.0   3.0s    50    100   100   100   100   100   100   100   100   100     0     0    46   100   100   100   100   100   100   100   100   100    98     0     0    52   100
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [bare]        66.5%    79.9%    83.2%   100%   0.0   2.4s    50    100   100   100   100   100   100   100     0   100     0     0     0   100   100   100   100   100   100   100   100     0    96     0     0     0    32
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Ministral-3-14B-Reasoning-2512-Q4_K_M (llamaserver/prompt)
+## gemma-4-E4B-it-Q4_K_M (llamaserver/native)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]    79.5%    80.5%    98.7%    96%   0.5   3.7s    50    100   100   100   100   100    82   100   100    78    30     6    58    92   100   100   100   100   100    74   100   100    80    20     6    56    84
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [bare]        65.2%    78.1%    83.5%   100%   0.3   3.4s    50    100    98   100   100    98    74    92     0    62    30     0    56    70   100   100   100   100    96    78    96     0    72    14     0    58     0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/N [reforged]              79.7%    79.9%    99.8%   100%   0.3   8.1s    50    100   100   100   100   100    94    98   100    84     8    30    64    92   100   100   100   100   100    88    94   100    86     2     0    48    84
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:keep-last]    78.7%    81.4%    96.7%    99%   0.5   9.3s    50    100   100   100   100   100    96    92    96    96     6    20    64    62   100   100   100   100   100    88    88   100    98     2     0    82    56
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:full]         79.2%    82.2%    96.3%    99%   0.5  10.0s    50    100   100   100   100   100    96    88    94   100     2    40    78    54   100   100   100   100   100    94    84    98    96     0     0    82    52
+gemma-4-E4B-it-Q4_K_M LS/N [bare]                  67.5%    73.5%    91.8%   100%   0.3   7.5s    50    100   100   100   100   100    88    96     0    90     2    28    44    70   100   100   100   100   100    82   100     0    88     4     0    62     0
+gemma-4-E4B-it-Q4_K_M LS/N [bare:keep-last]        66.6%    75.6%    88.1%   100%   0.2   9.3s    50    100   100    68   100   100    86    90     0    96     0    24    78    84    98   100    66   100   100    76    88     0    96     4     0    78     0
+gemma-4-E4B-it-Q4_K_M LS/N [bare:full]             67.9%    77.5%    87.7%   100%   0.2   9.5s    50     96    98    78   100   100    94    84     0    92     2    24    80    80    98   100    78   100   100    92    92     0    94     0     0    84     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## qwen3:14b-q4_K_M (ollama/native)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-qwen3:14b-q4_K_M OL/N [reforged]    78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
-qwen3:14b-q4_K_M OL/N [bare]        46.5%    61.3%    75.8%    87%   0.7  34.7s    50     90    92     4     4    86   100   100     0    78     6     4    48     6    86    92     2    26    84    68   100     0    86     0     4    42     0
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                               Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+qwen3:14b-q4_K_M OL/N [reforged:full]²    78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
+qwen3:14b-q4_K_M OL/N [bare:full]²        46.5%    61.3%    75.8%    87%   0.7  34.7s    50     90    92     4     4    86   100   100     0    78     6     4    48     6    86    92     2    26    84    68   100     0    86     0     4    42     0
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Instruct-2512-Q4_K_M (llamaserver/native)
@@ -240,31 +291,20 @@ qwen3:14b-q4_K_M OL/N [bare]        46.5%    61.3%    75.8%    87%   0.7  34.7s
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]    78.3%    78.4%    99.8%    95%   0.4   3.2s    50    100   100   100   100   100   100   100    98   100    22     0     0   100   100   100   100   100   100   100   100    98   100    14     2     2   100
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [bare]        27.3%    61.4%    44.5%   100%   0.0   3.8s    50      0   100     0     0   100   100     0     0     8     0     0     0   100     0   100     0     0   100   100     0     0     2     0     0     0     0
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]    78.3%    78.3%   100.0%    95%   0.4   3.2s    50    100   100   100   100   100   100   100   100   100    18     0     0   100   100   100   100   100   100   100   100   100   100    16     2     0   100
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [bare]        27.3%    62.8%    43.5%   100%   0.0   3.8s    50      0   100     0     0   100   100     0     0     4     0     0     2   100     0   100     0     0   100   100     0     0     4     0     0     0     0
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## gemma-4-E4B-it-Q4_K_M (llamaserver/native)
-
-```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q4_K_M LS/N [reforged]    78.2%    82.2%    95.1%    98%   0.5   9.0s    50    100   100   100   100   100    92    98    98    90     0    24    80    50   100   100   100   100   100    94    90    94    98     0     0    84    40
-gemma-4-E4B-it-Q4_K_M LS/N [bare]        67.8%    77.0%    88.1%   100%   0.2   9.4s    50     96   100    82   100   100    96    86     0    94     0    18    80    78    96   100    86   100    96    92    88     0    94     0     0    82     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-```
-
 ## Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M (llamaserver/prompt)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged]    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [bare]        65.7%    81.1%    81.0%    80%   0.9   3.4s    50    100   100    96   100    98   100   100     0    32     0     0   100    46   100   100    98   100    98   100   100     0    40     0     0   100     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged:full]²    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [bare:full]²        65.7%    81.1%    81.0%    80%   0.9   3.4s    50    100   100    96   100    98   100   100     0    32     0     0   100    46   100   100    98   100    98   100   100     0    40     0     0   100     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-14B-Instruct-2512-Q4_K_M (llamaserver/native)
@@ -273,20 +313,35 @@ Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [bare]        65.7%    81.1%
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]    78.1%    78.1%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    16     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [bare]        27.5%    80.8%    34.1%   100%   0.0   3.0s    50      0   100     0     0   100   100     0     0    10     0     0     0   100     0   100     0     0   100    98     0     0     8     0     0     0     0
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]    77.8%    77.8%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100     8     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [bare]        27.6%    82.2%    33.6%   100%   0.0   3.0s    50      0   100     0     0   100   100     0     0    12     0     0     0   100     0   100     0     0   100    98     0     0     8     0     0     0     0
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma-4-E4B-it-Q8_0 (llamaserver/native)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q8_0 LS/N [reforged]    76.2%    80.7%    94.5%    98%   0.6  12.8s    50    100   100   100   100   100    84    88    90    96     2    14    80    44   100   100   100   100   100    88    90    96    94     4     0    80    32
-gemma-4-E4B-it-Q8_0 LS/N [bare]        67.7%    76.9%    88.1%   100%   0.3  13.8s    50    100   100    78   100    98    92    90     0    98     4    10    84    92    98    94    84   100   100    84    88     0    84     4     0    78     0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/N [reforged]              77.8%    77.9%    99.8%   100%   0.2  10.8s    50    100   100   100   100   100    76    90   100   100     0    18    38    98   100   100   100   100   100    76    98   100    98     0     0    36    94
+gemma-4-E4B-it-Q8_0 LS/N [reforged:keep-last]    75.7%    79.3%    95.5%    99%   0.5  13.1s    50    100   100   100   100   100    76    92    98    96     4    16    82    48   100   100   100   100   100    50    86    98    94     2     0    86    40
+gemma-4-E4B-it-Q8_0 LS/N [reforged:full]         75.6%    80.8%    93.6%    98%   0.6  12.5s    50    100   100   100   100   100    92    94    92    88     0    18    82    28   100   100   100   100   100    76    92    94    98     2     0    84    26
+gemma-4-E4B-it-Q8_0 LS/N [bare]                  66.2%    72.2%    91.7%   100%   0.2  10.1s    50     98   100   100   100   100    82    94     0    94     0    30    48    86   100   100   100   100   100    74    96     0    94     0     0    26     0
+gemma-4-E4B-it-Q8_0 LS/N [bare:keep-last]        67.0%    75.4%    88.8%   100%   0.2  14.0s    50     96   100    72    98   100    74    96     0    96     0    28    78    94    98   100    74   100    98    74    92     0    90     0     0    84     0
+gemma-4-E4B-it-Q8_0 LS/N [bare:full]             65.5%    75.3%    87.0%   100%   0.3  14.4s    50     96    96    76   100   100    88    86     0    80     0    14    84    80    92   100    74   100   100    76    88     0    88     0     0    86     0
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
+## phi-4-Q4_K_M (llamaserver/prompt)
+
+```
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+phi-4-Q4_K_M LS/P [reforged]    75.3%    75.4%    99.8%    83%   0.9   4.2s    50    100   100   100   100   100    26    62    94    96    62    34    66    70   100   100   100   100   100    28    84    98    94    42     0    60    42
+phi-4-Q4_K_M LS/P [bare]        57.9%    68.0%    85.2%    91%   0.5   3.4s    50    100   100   100    98   100    22    58     0    80    28    20    16    54   100   100   100    98   100    14    68     0    82    36     2    28     2
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Ministral-3-8B-Instruct-2512-Q4_K_M (llamaserver/prompt)
@@ -295,229 +350,239 @@ gemma-4-E4B-it-Q8_0 LS/N [bare]        67.7%    76.9%    88.1%   100%   0.3  13.
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]    75.6%    83.8%    90.2%    79%   1.3   3.0s    50     98   100     0   100   100   100   100   100   100    22    12    56   100   100   100     0   100   100   100   100   100   100    28     0    50   100
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [bare]        45.8%    81.4%    56.3%    94%   0.8   2.6s    50      0   100     0   100   100   100     0     0    98     0    16    26   100     0   100     0   100   100   100     0     0   100     0     0    52     0
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]    74.9%    83.3%    89.9%    79%   1.3   3.1s    50    100   100     0   100   100   100   100   100   100    20     2    42   100   100   100     0   100   100   100   100   100   100    22     0    62   100
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [bare]        45.8%    81.5%    56.2%    94%   0.7   2.6s    50      0   100     0   100   100   100     2     0    98     0    16    36   100     0   100     0   100   100   100     4     0    98     0     0    38     0
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## ministral-3:14b-instruct-2512-q4_K_M (ollama/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged]    74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
-ministral-3:14b-instruct-2512-q4_K_M OL/N [bare]        32.0%    84.4%    37.9%   100%   0.1   3.3s    50    100   100   100     0   100     0     0     0     0     0     0     0    30   100   100   100     0   100     0     0     0     2     0     0     0     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged:full]²    74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
+ministral-3:14b-instruct-2512-q4_K_M OL/N [bare:full]²        32.0%    84.4%    37.9%   100%   0.1   3.3s    50    100   100   100     0   100     0     0     0     0     0     0     0    30   100   100   100     0   100     0     0     0     2     0     0     0     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma4:e4b-it-q4_K_M (ollama/native)
 
 ```
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma4:e4b-it-q4_K_M OL/N [reforged]    74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
-gemma4:e4b-it-q4_K_M OL/N [bare]        62.3%    71.7%    86.9%    90%   0.4   8.7s    50     98   100    88   100    98    96   100     0    82     0     2    50    24    96   100    92   100   100    84    98     0    72     2     0    38     0
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma4:e4b-it-q4_K_M OL/N [reforged:full]²    74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
+gemma4:e4b-it-q4_K_M OL/N [bare:full]²        62.3%    71.7%    86.9%    90%   0.4   8.7s    50     98   100    88   100    98    96   100     0    82     0     2    50    24    96   100    92   100   100    84    98     0    72     2     0    38     0
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma-4-E4B-it-Q8_0 (llamaserver/prompt)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q8_0 LS/P [reforged]    74.7%    74.7%   100.0%    85%   0.6  12.7s    50    100   100   100   100   100    70   100    90    88     0    16    34    94   100   100   100   100   100    48   100    98    84     0     0    30    90
-gemma-4-E4B-it-Q8_0 LS/P [bare]        61.9%    67.1%    92.2%    94%   0.3  12.2s    50    100   100   100   100   100    58   100     0    90     0    24    20    98   100   100   100   100   100     4   100     0    90     0     0    26     0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/P [reforged]              73.2%    73.3%    99.8%    85%   0.6  13.3s    50    100   100   100   100   100    54    98    94    80     0    28    20    94   100   100   100   100   100    40   100    98    90     0     0    12    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:keep-last]    74.1%    74.2%    99.8%    85%   0.6  13.4s    50    100   100   100   100   100    48   100    96    84     0    22    28    94   100   100   100   100   100    52   100    98    90     0     0    18    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:full]         73.7%    73.7%   100.0%    86%   0.6  12.7s    50    100   100   100   100   100    48   100    92    90     0    36    28    88   100   100   100   100   100    48    94    98    82     0     0    20    92
+gemma-4-E4B-it-Q8_0 LS/P [bare]                  61.2%    66.3%    92.3%    94%   0.3  12.1s    50    100   100   100   100   100    64    96     0    84     0    22    22    94   100   100   100   100   100     0   100     0    80     0     0    30     0
+gemma-4-E4B-it-Q8_0 LS/P [bare:keep-last]        61.2%    66.4%    92.2%    95%   0.3  12.5s    50    100   100   100   100   100    68   100     0    84     0    22    20    90   100   100   100   100   100     2   100     0    82     0     0    24     0
+gemma-4-E4B-it-Q8_0 LS/P [bare:full]             60.5%    65.7%    92.1%    94%   0.3  12.6s    50    100   100   100   100   100    60    98     0    84     0    14    24    90   100   100   100   100   100     2    96     0    88     0     0    18     0
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma4:e4b-it-q8_0 (ollama/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma4:e4b-it-q8_0 OL/N [reforged]    73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
-gemma4:e4b-it-q8_0 OL/N [bare]        62.5%    69.9%    89.3%    89%   0.5  12.0s    50     90   100   100   100   100    86    92     0    72     0     2    48    50   100   100    98   100   100    66    94     0    84     0     0    42     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-```
-
-## Qwen3-8B-Q8_0 (llamaserver/prompt)
-
-```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q8_0 LS/P [reforged]    73.1%    73.2%    99.8%    89%   0.4  28.4s    50    100   100   100   100   100   100   100    58    96     0     8    28    94   100   100   100   100    96   100    98    64    88     0     0    12    58
-Qwen3-8B-Q8_0 LS/P [bare]        64.5%    70.4%    91.6%    96%   0.2  27.4s    50    100   100    92   100   100    96   100     0    86     0     4    20   100   100   100   100    98    98    76    98     0    88     0     0    20     0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-```
-
-## phi-4-Q4_K_M (llamaserver/prompt)
-
-```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-phi-4-Q4_K_M LS/P [reforged]    72.9%    73.3%    99.5%    85%   0.9   4.1s    50    100   100   100   100   100    34    56    96    90    52    24    38    70   100   100   100   100   100    24    62    92    98    52     0    60    48
-phi-4-Q4_K_M LS/P [bare]        59.2%    69.4%    85.3%    92%   0.5   3.3s    50    100   100   100   100   100    22    68     0    88    44    28    24    32   100   100   100   100   100    10    76     0    92    38     0    18     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma4:e4b-it-q8_0 OL/N [reforged:full]²    73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
+gemma4:e4b-it-q8_0 OL/N [bare:full]²        62.5%    69.9%    89.3%    89%   0.5  12.0s    50     90   100   100   100   100    86    92     0    72     0     2    48    50   100   100    98   100   100    66    94     0    84     0     0    42     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma-4-E4B-it-Q4_K_M (llamaserver/prompt)
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q4_K_M LS/P [reforged]    72.8%    72.8%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    38    98    94    94     0    18    26    96   100   100   100   100   100    26    98   100    92     0     2    22    90
-gemma-4-E4B-it-Q4_K_M LS/P [bare]        60.6%    65.8%    92.1%    92%   0.5   8.4s    50    100   100   100   100   100    54    96     0    90     0    14    22    92   100   100   100   100   100    12   100     0    76     0     0    20     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/P [reforged]              72.4%    72.4%    99.9%    85%   0.6   8.9s    50    100   100   100   100   100    56    96   100    86     0    14    28    84   100   100   100   100   100    34   100    96    74     0     0    22    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:keep-last]    73.2%    73.2%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    40    98    96    92     0    10    24    96   100   100   100   100   100    38   100   100    86     0     0    32    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:full]         72.9%    72.9%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    54    98    98    86     0    10    28    88   100   100   100   100   100    26    98    96    94     0     0    30    90
+gemma-4-E4B-it-Q4_K_M LS/P [bare]                  62.5%    67.8%    92.2%    91%   0.4   9.0s    50    100   100   100   100   100    56   100     0    88     0    22    30   100   100   100   100   100   100    10   100     0    88     0     0    30     0
+gemma-4-E4B-it-Q4_K_M LS/P [bare:keep-last]        59.9%    65.0%    92.2%    92%   0.4   8.7s    50    100   100   100   100   100    44    98     0    90     0     4    22    92   100   100   100   100   100    12    92     0    92     0     0    12     0
+gemma-4-E4B-it-Q4_K_M LS/P [bare:full]             61.5%    66.7%    92.2%    91%   0.4   9.1s    50    100   100   100   100   100    46    94     0    86     0    14    28    88   100   100   100   100   100    22    96     0    94     0     0    30     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Nemotron-3-Nano-30B-A3B-Q4_K_M (llamaserver/native)
+## Qwen3-8B-Q8_0 (llamaserver/prompt)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged]    71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [bare]        37.5%    85.7%    43.7%    95%   0.2   6.6s    50     32   100    88    74    86    70     6     0    20    10     0    14     4    14   100    94    72    92    56     0     0    18    14     0    10     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/P [reforged]              72.0%    72.3%    99.6%    88%   0.4  28.6s    50    100   100   100   100   100    96   100    56    90     0     2    10    96   100   100   100   100    96    98   100    58    88     0     0    20    62
+Qwen3-8B-Q8_0 LS/P [reforged:keep-last]    72.8%    73.0%    99.7%    89%   0.4  28.0s    50    100   100   100   100   100    98    98    80    90     0     6     8    94   100   100   100   100    96    96   100    60    98     0     2    12    54
+Qwen3-8B-Q8_0 LS/P [reforged:full]         72.8%    72.9%    99.8%    88%   0.4  28.9s    50    100   100   100   100   100    98   100    70    90     0     4    20    96   100   100   100   100    92   100    96    66    92     0     0    12    56
+Qwen3-8B-Q8_0 LS/P [bare]                  63.5%    69.6%    91.3%    97%   0.2  27.0s    50    100   100    96    98   100    98    96     0    90     0     8    18    92   100   100    94   100   100    64   100     0    84     0     2    12     0
+Qwen3-8B-Q8_0 LS/P [bare:keep-last]        63.4%    69.7%    90.9%    96%   0.2  28.5s    50    100   100    94   100   100    98    98     0    88     0     4    16    92   100   100    88   100    94    72    98     0    96     0     0    10     0
+Qwen3-8B-Q8_0 LS/P [bare:full]             63.9%    69.9%    91.5%    96%   0.2  27.3s    50    100   100    94   100   100    96    98     0    88     0     2    14    90   100   100    98   100   100    68    98     0    96     0     0    20     0
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M (llamaserver/native)
+## Qwen3-8B-Q4_K_M (llamaserver/prompt)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged]    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [bare]        42.5%    67.4%    63.1%   100%   0.0   6.1s    50      0   100   100   100   100    68    10     0     4     4    12    12    98     0   100   100   100   100    56    20     0     0     0     2    20     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/P [reforged]              71.1%    71.2%    99.8%    87%   0.5  18.0s    50    100   100   100   100   100    96   100    70    66     0    18     8    96   100   100   100   100    98    96   100    60    60     0     0    10    70
+Qwen3-8B-Q4_K_M LS/P [reforged:keep-last]    72.2%    72.3%    99.9%    88%   0.5  17.9s    50    100   100   100   100   100   100   100    62    66     0    30     8    90   100   100   100   100   100    98   100    74    68     0     0     8    74
+Qwen3-8B-Q4_K_M LS/P [reforged:full]         70.5%    70.8%    99.6%    88%   0.4  17.4s    50    100   100   100   100   100    88   100    58    66     0    24    10    88   100   100   100   100   100    94    98    62    66     0     0     4    76
+Qwen3-8B-Q4_K_M LS/P [bare]                  57.4%    66.6%    86.2%    97%   0.2  17.2s    50     98   100    56    96   100    90   100     0    62     0    22     6    94   100   100    48    94   100    56   100     0    64     0     0     4     2
+Qwen3-8B-Q4_K_M LS/P [bare:keep-last]        59.1%    68.0%    86.9%    98%   0.2  16.6s    50     98   100    56    98   100    94   100     0    58     0    22     8    94   100   100    68    94    98    72   100     0    72     0     0     2     2
+Qwen3-8B-Q4_K_M LS/P [bare:full]             57.8%    66.9%    86.5%    97%   0.2  17.5s    50    100   100    66    94   100    96   100     0    54     0    24     8    96   100   100    58    96    98    68   100     0    44     0     0     2     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## ministral-3:8b-instruct-2512-q8_0 (ollama/native)
+## Qwen3-14B-Q4_K_M (llamaserver/prompt)
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                          Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-ministral-3:8b-instruct-2512-q8_0 OL/N [reforged]    70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
-ministral-3:8b-instruct-2512-q8_0 OL/N [bare]        17.8%    49.6%    36.0%    93%   0.4   6.8s    50      0     0     0     0    68   100     0     0    14     0    20     0    96     0     0     0     0    64   100     0     0     0     0     2     0     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-14B-Q4_K_M LS/P [reforged]              71.4%    71.4%    99.9%    86%   0.5  25.6s    50    100   100   100   100   100    98   100    70    70     0     0    30    80   100   100   100   100   100    92   100    76    56     0     0    22    62
+Qwen3-14B-Q4_K_M LS/P [reforged:keep-last]    71.8%    71.8%   100.0%    86%   0.5  23.8s    50    100   100   100   100   100    98   100    72    58     2     4    28    72   100   100   100   100   100    94   100    76    74     0     0    32    56
+Qwen3-14B-Q4_K_M LS/P [reforged:full]         71.8%    71.9%    99.8%    87%   0.5  24.3s    50    100   100   100   100   100    96   100    72    72     2     0    30    74   100   100   100   100   100    92   100    74    68     0     0    38    48
+Qwen3-14B-Q4_K_M LS/P [bare]                  54.1%    63.2%    85.6%    95%   0.2  22.3s    50    100   100    14   100   100    92   100     0    64     0     0    34    46   100   100    20    88    98    50   100     0    68     0     0    16    16
+Qwen3-14B-Q4_K_M LS/P [bare:keep-last]        53.8%    63.2%    85.2%    95%   0.2  23.1s    50    100   100    20   100   100    94   100     0    60     0     2    30    56   100   100    18    88    98    38   100     0    62     0     0    24    10
+Qwen3-14B-Q4_K_M LS/P [bare:full]             53.9%    62.9%    85.8%    94%   0.2  24.4s    50    100   100    22   100   100    90   100     0    66     0     0    30    56   100   100    14    76   100    30   100     0    70     0     0    36    12
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Qwen3-14B-Q4_K_M (llamaserver/prompt)
+## Nemotron-3-Nano-30B-A3B-Q4_K_M (llamaserver/native)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-14B-Q4_K_M LS/P [reforged]    70.5%    70.8%    99.7%    86%   0.5  24.2s    50    100   100   100   100   100    94   100    64    68     0     0    32    72   100   100   100   100    98    94   100    58    66     0     0    30    58
-Qwen3-14B-Q4_K_M LS/P [bare]        53.5%    62.7%    85.3%    96%   0.2  22.5s    50    100   100    18   100   100    92   100     0    60     0     0    34    50   100   100    16    84   100    56   100     0    62     0     2    10     6
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged:full]²    71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [bare:full]²        37.5%    85.7%    43.7%    95%   0.2   6.6s    50     32   100    88    74    86    70     6     0    20    10     0    14     4    14   100    94    72    92    56     0     0    18    14     0    10     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Qwen3-8B-Q4_K_M (llamaserver/prompt)
+## Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M (llamaserver/native)
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q4_K_M LS/P [reforged]    70.4%    70.7%    99.6%    86%   0.5  17.8s    50    100   100   100   100   100    94   100    56    64     0    14    12    92   100   100   100   100   100    94   100    62    58     0     0     6    78
-Qwen3-8B-Q4_K_M LS/P [bare]        57.7%    66.7%    86.5%    97%   0.2  17.3s    50    100   100    50    98   100    92   100     0    50     0    14    12    92   100   100    56    94   100    76   100     0    62     0     0     4     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged:full]²    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [bare:full]²        42.5%    67.4%    63.1%   100%   0.0   6.1s    50      0   100   100   100   100    68    10     0     4     4    12    12    98     0   100   100   100   100    56    20     0     0     0     2    20     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Qwen3-8B-Q8_0 (llamaserver/native)
+## ministral-3:8b-instruct-2512-q8_0 (ollama/native)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q8_0 LS/N [reforged]    70.3%    70.5%    99.7%    88%   0.6  24.1s    50    100   100   100   100   100   100   100    60    82     4    22    20    32   100   100   100   100    98    94   100    58    66     2    12    28    50
-Qwen3-8B-Q8_0 LS/N [bare]        46.6%    64.0%    72.8%   100%   0.1  20.5s    50     80    76     0    86    94    94    82     0    74     0    14    16     4    88    70     0    88    84    94    76     0    66     0     4    18     4
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ministral-3:8b-instruct-2512-q8_0 OL/N [reforged:full]²    70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
+ministral-3:8b-instruct-2512-q8_0 OL/N [bare:full]²        17.8%    49.6%    36.0%    93%   0.4   6.8s    50      0     0     0     0    68   100     0     0    14     0    20     0    96     0     0     0     0    64   100     0     0     0     0     2     0     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Nemotron-3-Nano-30B-A3B-Q4_K_M (llamaserver/prompt)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged]    70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [bare]        58.6%    65.1%    90.1%   100%   0.1  10.0s    50    100   100   100    98    98    48   100     0    88     2     4     0    98   100   100   100    94    96     6   100     0    88     2     0     2     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged:full]²    70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [bare:full]²        58.6%    65.1%    90.1%   100%   0.1  10.0s    50    100   100   100    98    98    48   100     0    88     2     4     0    98   100   100   100    94    96     6   100     0    88     2     0     2     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## granite4.1:8b-q8_0 (ollama/native)
+## Qwen3-8B-Q8_0 (llamaserver/native)
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite4.1:8b-q8_0 OL/N [reforged]    69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
-granite4.1:8b-q8_0 OL/N [bare]        46.2%    60.0%    76.9%    95%   0.7   3.1s    50      0   100     0   100   100   100   100     0   100     0     0     0     0     0   100     0   100   100   100   100     0   100     0     0     0     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/N [reforged]              68.2%    68.5%    99.5%    95%   0.3  24.8s    50    100   100   100   100   100   100   100    56    78     6    24    28     6    98   100   100   100   100   100   100    52    84     6     2    30     2
+Qwen3-8B-Q8_0 LS/N [reforged:keep-last]    67.0%    67.2%    99.8%    92%   0.4  23.2s    50    100   100   100   100   100    94    98    48    84     2    22    10    12   100   100   100   100   100    96   100    52    80     0    18    20     6
+Qwen3-8B-Q8_0 LS/N [reforged:full]         69.3%    69.6%    99.6%    88%   0.6  24.7s    50     98   100   100   100   100    94   100    48    76     2    28    24    46   100   100   100   100   100    98   100    46    76     2     8    16    40
+Qwen3-8B-Q8_0 LS/N [bare]                  50.4%    63.4%    79.5%   100%   0.1  23.1s    50     88   100     2    62    98   100    98     0    70     2    22    40     2    92   100     4    46    88    94    98     0    70     2     4    28     0
+Qwen3-8B-Q8_0 LS/N [bare:keep-last]        49.7%    65.4%    76.0%   100%   0.1  21.6s    50     86    78     2    88    88   100    80     0    64     2    18    18     8    90    88     2    84    94    96    86     0    92     0    12    16     0
+Qwen3-8B-Q8_0 LS/N [bare:full]             46.8%    63.0%    74.4%   100%   0.1  20.9s    50     84    70     0    84    96    96    82     0    62     0    12    14     8    88    70     2    98    88    84    86     0    74     0     4    16     0
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## Qwen3-8B-Q4_K_M (llamaserver/native)
+## granite4.1:8b-q8_0 (ollama/native)
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q4_K_M LS/N [reforged]    68.2%    68.4%    99.6%    86%   0.7  16.1s    50     98   100   100   100   100    92   100    48    78     0    44     8    38   100   100   100   100   100    90   100    40    76     0     8    14    38
-Qwen3-8B-Q4_K_M LS/N [bare]        44.6%    63.0%    70.8%   100%   0.1  13.8s    50     90    74     2    88    74    90    86     0    60     2    16    16     6    88    82     6    90    72    76    70     0    60     0     8     4     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite4.1:8b-q8_0 OL/N [reforged:full]²    69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
+granite4.1:8b-q8_0 OL/N [bare:full]²        46.2%    60.0%    76.9%    95%   0.7   3.1s    50      0   100     0   100   100   100   100     0   100     0     0     0     0     0   100     0   100   100   100   100     0   100     0     0     0     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## Qwen3-14B-Q4_K_M (llamaserver/native)
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-14B-Q4_K_M LS/N [reforged]    67.7%    67.7%    99.9%    85%   0.9  20.8s    50    100   100   100   100   100    94   100    62    36     4    22    44    22   100   100   100   100    98    84   100    66    24    12    18    42    32
-Qwen3-14B-Q4_K_M LS/N [bare]        28.7%    50.1%    57.2%   100%   0.0  17.2s    50    100     4     0    24    46    62    30     0    20     6    10    44     0   100    12     8    54    54    68    34     0    20    14     4    32     0
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-14B-Q4_K_M LS/N [reforged]              68.5%    68.5%   100.0%   100%   0.4  21.8s    50    100   100   100   100    98   100   100    56    72    14     4    58     4   100   100   100   100   100   100   100    42    72    10     0    52     0
+Qwen3-14B-Q4_K_M LS/N [reforged:keep-last]    64.0%    64.0%    99.9%    91%   0.6  20.3s    50    100   100   100   100   100    90    98    48    40    18     6    38     4   100   100   100    98    96    88   100    54    30    16     2    38     0
+Qwen3-14B-Q4_K_M LS/N [reforged:full]         68.4%    68.4%    99.9%    83%   0.9  21.9s    50    100   100   100   100    98    90    98    60    32    20    18    50    38   100   100   100   100   100    86   100    74    34     6    18    34    22
+Qwen3-14B-Q4_K_M LS/N [bare]                  48.1%    60.0%    80.2%   100%   0.1  21.2s    50    100    96     0    38    66   100   100     0    62     8     0    48     2   100    98     0    42    56    94   100     0    72    12     0    56     0
+Qwen3-14B-Q4_K_M LS/N [bare:keep-last]        30.8%    52.1%    59.2%   100%   0.0  18.3s    50    100    28     6    24    60    72    34     0    12    16     4    34     0   100    12     8    46    52    60    46     0    24    28     6    30     0
+Qwen3-14B-Q4_K_M LS/N [bare:full]             27.5%    48.5%    56.8%   100%   0.0  16.5s    50     96    16     4    24    48    62    36     0    22     4     6    28     0   100    18     6    52    62    60    18     0    14     4     4    32     0
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## qwen3:8b-q8_0 (ollama/native)
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-qwen3:8b-q8_0 OL/N [reforged]    67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
-qwen3:8b-q8_0 OL/N [bare]        47.3%    56.8%    83.3%    96%   0.1  24.0s    50     86   100     2    34    92   100   100     0    64     0     4     6    16    88    98     2    68    92    98    92     0    84     0     0     0     4
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+qwen3:8b-q8_0 OL/N [reforged:full]²    67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
+qwen3:8b-q8_0 OL/N [bare:full]²        47.3%    56.8%    83.3%    96%   0.1  24.0s    50     86   100     2    34    92   100   100     0    64     0     4     6    16    88    98     2    68    92    98    92     0    84     0     0     0     4
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## ministral-3:8b-instruct-2512-q4_K_M (ollama/native)
+## Qwen3-8B-Q4_K_M (llamaserver/native)
 
 ```
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged]    66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
-ministral-3:8b-instruct-2512-q4_K_M OL/N [bare]        14.2%    45.1%    31.4%    91%   0.2   5.2s    50      0     0     0     0    76   100     0     0     4     0     0     2     8     0     0     0     0    86    70     0     0    22     0     0     0     0
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/N [reforged]              67.3%    67.5%    99.7%    96%   0.3  15.6s    50    100   100   100   100   100   100   100    40    98     6    14    22     2   100   100   100   100   100   100   100    48    86     2     0    26     6
+Qwen3-8B-Q4_K_M LS/N [reforged:keep-last]    64.5%    64.6%    99.9%    91%   0.4  15.0s    50    100   100   100   100   100   100   100    30    82     0    28    12    10   100   100   100    98   100    96   100    22    86     2     2     6     4
+Qwen3-8B-Q4_K_M LS/N [reforged:full]         65.8%    66.0%    99.7%    84%   0.7  17.2s    50    100   100   100   100   100    94   100    34    66     0    18    10    38   100   100   100    96   100    86   100    34    74     0    10    12    40
+Qwen3-8B-Q4_K_M LS/N [bare]                  53.2%    64.9%    82.1%   100%   0.1  15.0s    50     92   100     4    86    92   100   100     0    62     2    32    26    12    86   100     6    96    90   100    98     0    76     2     0    22     0
+Qwen3-8B-Q4_K_M LS/N [bare:keep-last]        46.4%    64.3%    72.2%   100%   0.1  15.0s    50     96    78     2    86    70    94    76     0    74     4    34     4     8    80    76     4    76    76    96    86     0    76     0     4     6     0
+Qwen3-8B-Q4_K_M LS/N [bare:full]             45.2%    63.7%    71.0%   100%   0.1  13.6s    50     92    80     2    86    76   100    80     0    74     0    14    12     2    88    68     4    74    82    94    70     0    60     0     4    14     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## granite-4.1-8b-Q4_K_M (llamaserver/native)
+## ministral-3:8b-instruct-2512-q4_K_M (ollama/native)
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite-4.1-8b-Q4_K_M LS/N [reforged]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-granite-4.1-8b-Q4_K_M LS/N [bare]        53.8%    70.0%    76.9%    96%   0.2   1.9s    50      0   100   100   100   100   100   100     0   100     0     0     0     0     0   100   100   100   100   100   100     0   100     0     0     0     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged:full]²    66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
+ministral-3:8b-instruct-2512-q4_K_M OL/N [bare:full]²        14.2%    45.1%    31.4%    91%   0.2   5.2s    50      0     0     0     0    76   100     0     0     4     0     0     2     8     0     0     0     0    86    70     0     0    22     0     0     0     0
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite-4.1-8b-Q8_0 (llamaserver/native)
@@ -526,20 +591,35 @@ granite-4.1-8b-Q4_K_M LS/N [bare]        53.8%    70.0%    76.9%    96%   0.2
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite-4.1-8b-Q8_0 LS/N [reforged]    65.4%    65.4%   100.0%    88%   1.4   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q8_0 LS/N [reforged]    65.4%    65.4%   100.0%    88%   1.3   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
 granite-4.1-8b-Q8_0 LS/N [bare]        46.2%    60.0%    77.0%    95%   1.1   3.2s    50      0   100     2   100   100   100   100     0   100     0     0     0     0     0   100     0   100   100   100   100     0   100     0     0     0     0
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
+## granite-4.1-8b-Q4_K_M (llamaserver/native)
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite-4.1-8b-Q4_K_M LS/N [reforged]              65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:keep-last]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:full]         65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [bare]                  53.8%    70.0%    76.9%    96%   0.2   1.9s    50      0   100   100   100   100   100   100     0   100     0     0     0     0     0   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [bare:keep-last]        53.8%    70.0%    76.9%    96%   0.2   1.9s    50      0   100   100   100   100   100   100     0   100     0     0     0     0     0   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [bare:full]             53.8%    70.0%    76.9%    96%   0.2   2.0s    50      0   100   100   100   100   100   100     0   100     0     0     0     0     0   100   100   100   100   100   100     0   100     0     0     0     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
 ## qwen3:8b-q4_K_M (ollama/native)
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-qwen3:8b-q4_K_M OL/N [reforged]    64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
-qwen3:8b-q4_K_M OL/N [bare]        40.7%    53.0%    76.8%    96%   0.1  15.8s    50     56    98     2     4   100    94   100     2    38     0     4     0    12    68   100     6    78    98    74    96     0    28     0     0     0     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+qwen3:8b-q4_K_M OL/N [reforged:full]²    64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
+qwen3:8b-q4_K_M OL/N [bare:full]²        40.7%    53.0%    76.8%    96%   0.1  15.8s    50     56    98     2     4   100    94   100     2    38     0     4     0    12    68   100     6    78    98    74    96     0    28     0     0     0     0
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite-4.1-8b-Q4_K_M (llamaserver/prompt)
@@ -560,26 +640,28 @@ granite-4.1-8b-Q4_K_M LS/P [bare]        46.2%    50.0%    92.3%   100%   0.0
 Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 granite-4.1-8b-Q8_0 LS/P [reforged]    61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
-granite-4.1-8b-Q8_0 LS/P [bare]        42.3%    50.0%    84.6%    86%   0.4   2.3s    50      0   100     0   100   100   100   100     0     0     0     0   100     0     0   100     0   100   100     0   100     0     0     0     0   100     0
+granite-4.1-8b-Q8_0 LS/P [bare]        42.3%    50.0%    84.6%    86%   0.4   2.4s    50      0   100     0   100   100   100   100     0     0     0     0   100     0     0   100     0   100   100     0   100     0     0     0     0   100     0
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite4.1:8b-q4_K_M (ollama/native)
 
 ```
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite4.1:8b-q4_K_M OL/N [reforged]    57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
-granite4.1:8b-q4_K_M OL/N [bare]        38.6%    50.2%    76.9%    94%   1.0   2.1s    50      0   100     0   100   100   100   100     0     2     0     0     0     0     0   100     0   100   100   100   100     0     2     0     0     0     0
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite4.1:8b-q4_K_M OL/N [reforged:full]²    57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
+granite4.1:8b-q4_K_M OL/N [bare:full]²        38.6%    50.2%    76.9%    94%   1.0   2.1s    50      0   100     0   100   100   100   100     0     2     0     0     0     0     0   100     0   100   100   100   100     0     2     0     0     0     0
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 Scr=score(correct/total), Acc=accuracy(correct/total, excl validate errors), Cmp=completeness(completed/total), Eff=efficiency(ideal/actual calls), Wst=avg wasted calls, Spd=avg time(excl compaction)
 rel=relevance_detection, arg=argument_fidelity, tsl=tool_selection, b2s=basic_2step, s3s=sequential_3step, crt=conditional_routing, srn=sequential_reasoning, err=error_recovery, dgr=data_gap_recovery, dge=data_gap_recovery_extended, art=argument_transformation, grs=grounded_synthesis, iar=inconsistent_api_recovery, rel_s=relevance_detection_stateful, arg_s=argument_fidelity_stateful, tsl_s=tool_selection_stateful, b2s_s=basic_2step_stateful, s3s_s=sequential_3step_stateful, crt_s=conditional_routing_stateful, srn_s=sequential_reasoning_stateful, err_s=error_recovery_stateful, dgr_s=data_gap_recovery_stateful, dge_s=data_gap_recovery_extended_stateful, art_s=argument_transformation_stateful, grs_s=grounded_synthesis_stateful, iar_s=inconsistent_api_recovery_stateful
 Ablation: full=all guardrails, no_rescue=no rescue loop, no_nudge=no rescue/retry nudge, no_steps=no step enforcement, no_recovery=no error recovery, no_compact=no compaction, bare=all guardrails off
+Replay: ':keep-last'/':full' tags = reasoning_replay policy (how much captured reasoning is re-sent to the backend each turn); untagged = none (default). Rows predating the knob ran unbounded replay and count as full.
 
 Eval generations (older runs carried forward, superscript-tagged):
   ¹ gen 1 — v0.6.0 suite — incl. Anthropic ablation (commit 2b05dc4, 2026-05-08)
+  ² gen 2 — v0.7.0 lineup refresh (8–14B) + 32GB tier debut (v0.7.4) (commit 655e1f6, 2026-05-22)
 
-*Generated 2026-06-03 00:09*
+*Generated 2026-06-11 20:28*
diff --git a/docs/results/raw/reforged/all.md b/docs/results/raw/reforged/all.md
index ee89de6..437e7ab 100644
--- a/docs/results/raw/reforged/all.md
+++ b/docs/results/raw/reforged/all.md
@@ -1,69 +1,106 @@
 # Forge Eval — Reforged Leaderboard
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-claude-opus-4-6 AN/N [reforged]¹                              99.2%    99.8%    99.4%   100%   0.0  15.6s    50    100   100   100   100   100    98   100   100   100   100    98    94    98   100   100   100   100   100   100   100    96   100   100    98   100    98
-claude-sonnet-4-6 AN/N [reforged]¹                            98.4%    98.5%    99.9%   100%   0.1  13.1s    50    100   100   100   100   100   100   100   100   100    98    74    98   100   100   100   100   100   100   100   100   100   100   100    88   100   100
-claude-haiku-4-5-20251001 AN/N [reforged]¹                    94.5%    94.9%    99.6%   100%   0.3   8.5s    50    100   100   100   100   100   100   100   100   100    80    80    98   100   100   100   100   100   100   100    94   100   100    76    36    94   100
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged]                     94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
-Qwen3.5-27B-Q4_K_M LS/N [reforged]                            93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
-Qwen3.6-27B-Q4_K_M LS/N [reforged]                            92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
-Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged]                        92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
-Qwen3.5-27B-Q4_K_M LS/P [reforged]                            86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]         84.5%    84.5%   100.0%    97%   0.6   5.4s    50    100   100   100   100   100    88   100   100    70    44    48    76    94   100   100   100   100   100    96    98   100    76    38    26    62    82
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]            84.2%    84.2%   100.0%    95%   0.5   6.0s    50    100   100   100   100   100   100   100   100    98    74    26    54    88   100   100   100   100   100   100   100   100   100    68     2    26    52
-Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]             84.4%    91.1%    92.6%    92%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100   100     0    98   100
-Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged]                        82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]          82.8%    82.8%    99.9%    95%   0.5   4.1s    50    100   100   100   100   100   100   100   100    98    66    24    34    92   100   100   100   100   100    96   100   100   100    70     0    30    42
-Qwen3.6-27B-Q4_K_M LS/P [reforged]                            83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged]                     82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
-Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]             81.4%    81.4%   100.0%   100%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    38     4     0   100   100   100   100   100   100   100   100   100   100    74     0     0   100
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]            81.3%    85.0%    95.7%    96%   0.7   3.9s    50    100   100    96   100   100    98   100    92    98    56    14    46    70   100   100    88   100   100    98   100    94   100    82     0    48    34
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]          80.2%    80.2%   100.0%   100%   0.0   2.9s    50    100   100   100   100   100   100   100   100   100     0     0    36   100   100   100   100   100   100   100   100   100   100     0     0    50   100
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]          80.5%    83.0%    97.0%    96%   0.7   2.7s    50    100    98    98   100   100    98    98   100    96    74    10    36    70   100    98    96   100   100    98   100    98    94    70     2    38    20
-qwen3:14b-q4_K_M OL/N [reforged]                              78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]         79.5%    80.5%    98.7%    96%   0.5   3.7s    50    100   100   100   100   100    82   100   100    78    30     6    58    92   100   100   100   100   100    74   100   100    80    20     6    56    84
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]          78.1%    78.1%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    16     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]           78.3%    78.4%    99.8%    95%   0.4   3.2s    50    100   100   100   100   100   100   100    98   100    22     0     0   100   100   100   100   100   100   100   100    98   100    14     2     2   100
-gemma-4-E4B-it-Q4_K_M LS/N [reforged]                         78.2%    82.2%    95.1%    98%   0.5   9.0s    50    100   100   100   100   100    92    98    98    90     0    24    80    50   100   100   100   100   100    94    90    94    98     0     0    84    40
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged]    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
-gemma-4-E4B-it-Q8_0 LS/N [reforged]                           76.2%    80.7%    94.5%    98%   0.6  12.8s    50    100   100   100   100   100    84    88    90    96     2    14    80    44   100   100   100   100   100    88    90    96    94     4     0    80    32
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]           75.6%    83.8%    90.2%    79%   1.3   3.0s    50     98   100     0   100   100   100   100   100   100    22    12    56   100   100   100     0   100   100   100   100   100   100    28     0    50   100
-gemma-4-E4B-it-Q8_0 LS/P [reforged]                           74.7%    74.7%   100.0%    85%   0.6  12.7s    50    100   100   100   100   100    70   100    90    88     0    16    34    94   100   100   100   100   100    48   100    98    84     0     0    30    90
-gemma4:e4b-it-q4_K_M OL/N [reforged]                          74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
-ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged]          74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
-gemma4:e4b-it-q8_0 OL/N [reforged]                            73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
-Qwen3-8B-Q8_0 LS/P [reforged]                                 73.1%    73.2%    99.8%    89%   0.4  28.4s    50    100   100   100   100   100   100   100    58    96     0     8    28    94   100   100   100   100    96   100    98    64    88     0     0    12    58
-phi-4-Q4_K_M LS/P [reforged]                                  72.9%    73.3%    99.5%    85%   0.9   4.1s    50    100   100   100   100   100    34    56    96    90    52    24    38    70   100   100   100   100   100    24    62    92    98    52     0    60    48
-gemma-4-E4B-it-Q4_K_M LS/P [reforged]                         72.8%    72.8%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    38    98    94    94     0    18    26    96   100   100   100   100   100    26    98   100    92     0     2    22    90
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged]    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
-Qwen3-14B-Q4_K_M LS/P [reforged]                              70.5%    70.8%    99.7%    86%   0.5  24.2s    50    100   100   100   100   100    94   100    64    68     0     0    32    72   100   100   100   100    98    94   100    58    66     0     0    30    58
-ministral-3:8b-instruct-2512-q8_0 OL/N [reforged]             70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged]                71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
-Qwen3-8B-Q8_0 LS/N [reforged]                                 70.3%    70.5%    99.7%    88%   0.6  24.1s    50    100   100   100   100   100   100   100    60    82     4    22    20    32   100   100   100   100    98    94   100    58    66     2    12    28    50
-Qwen3-8B-Q4_K_M LS/P [reforged]                               70.4%    70.7%    99.6%    86%   0.5  17.8s    50    100   100   100   100   100    94   100    56    64     0    14    12    92   100   100   100   100   100    94   100    62    58     0     0     6    78
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged]                70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
-granite4.1:8b-q8_0 OL/N [reforged]                            69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
-Qwen3-8B-Q4_K_M LS/N [reforged]                               68.2%    68.4%    99.6%    86%   0.7  16.1s    50     98   100   100   100   100    92   100    48    78     0    44     8    38   100   100   100   100   100    90   100    40    76     0     8    14    38
-Qwen3-14B-Q4_K_M LS/N [reforged]                              67.7%    67.7%    99.9%    85%   0.9  20.8s    50    100   100   100   100   100    94   100    62    36     4    22    44    22   100   100   100   100    98    84   100    66    24    12    18    42    32
-qwen3:8b-q8_0 OL/N [reforged]                                 67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
-ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged]           66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
-granite-4.1-8b-Q8_0 LS/N [reforged]                           65.4%    65.4%   100.0%    88%   1.4   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-qwen3:8b-q4_K_M OL/N [reforged]                               64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
-granite-4.1-8b-Q4_K_M LS/N [reforged]                         65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-granite-4.1-8b-Q4_K_M LS/P [reforged]                         61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
-granite-4.1-8b-Q8_0 LS/P [reforged]                           61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
-granite4.1:8b-q4_K_M OL/N [reforged]                          57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+claude-opus-4-8 AN/N [reforged]                                    100.0%   100.0%   100.0%   100%   0.0  13.3s    50    100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100
+claude-sonnet-4-6 AN/N [reforged]                                  100.0%   100.0%   100.0%   100%   0.0  18.2s    50    100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100
+claude-opus-4-6 AN/N [reforged:full]¹                               99.2%    99.8%    99.4%   100%   0.0  15.6s    50    100   100   100   100   100    98   100   100   100   100    98    94    98   100   100   100   100   100   100   100    96   100   100    98   100    98
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged:full]²                     94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
+claude-haiku-4-5-20251001 AN/N [reforged]                           94.2%    94.2%    99.9%   100%   0.3   6.6s    50    100   100   100   100   100   100    98   100   100    74    74    98   100   100   100   100   100   100   100   100   100   100    72    38    94   100
+Qwen3.5-27B-Q4_K_M LS/N [reforged:full]²                            93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
+Qwen3.6-27B-Q4_K_M LS/N [reforged:full]²                            92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
+Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged:full]²                        92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
+Qwen3.5-27B-Q4_K_M LS/P [reforged:full]²                            86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]                  84.5%    84.8%    99.7%    96%   0.6   5.3s    50    100   100    96   100   100    98   100   100   100    76    18    44    92   100   100   100   100   100   100    98   100    98    80     2    42    54
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:keep-last]        84.8%    85.6%    99.1%    94%   0.6   5.9s    50    100   100   100   100   100   100   100   100    98    70    24    42    86   100   100   100   100   100   100   100   100   100    82     6    36    62
+Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]                   84.2%    90.9%    92.7%    91%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100    96     0    98   100
+Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged:full]²                        82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:full]          83.2%    83.2%   100.0%    97%   0.6   5.6s    50    100   100   100   100   100    86    98   100    68    40    32    76    96   100   100   100   100   100    92   100   100    62    34    30    68    82
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]               83.3%    83.3%   100.0%    96%   0.6   4.8s    50    100   100   100   100   100   100   100   100    60    32    34    78    94   100   100   100   100   100    92   100   100    62    30    28    78    78
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:full]             83.1%    83.1%    99.9%    96%   0.5   6.0s    50    100   100   100   100   100   100    98   100   100    66    20    36    88   100   100   100   100   100   100   100   100    98    74     2    26    52
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]     82.9%    82.9%   100.0%    95%   0.6   5.0s    50    100   100   100   100   100    92   100   100    68    40    30    60    94   100   100   100   100   100    96   100   100    62    36    20    72    86
+Qwen3.6-27B-Q4_K_M LS/P [reforged:full]²                            83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged:full]²                     82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:full]           81.8%    81.8%   100.0%    95%   0.5   4.3s    50    100   100    98   100    98    98    98   100    98    62     8    30    96   100   100    98   100   100    98    96   100    98    62     2    40    46
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]      82.4%    83.0%    99.3%    92%   0.6   4.2s    50    100   100   100   100   100    98   100   100   100    68    16    28    86   100   100   100   100   100    96    98   100    96    64     6    24    62
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]                80.6%    80.6%   100.0%   100%   0.0   3.0s    50    100   100   100   100   100   100   100   100   100     0     0    46   100   100   100   100   100   100   100   100   100    98     0     0    52   100
+Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]                   81.0%    81.0%   100.0%   100%   0.3   4.1s    50    100   100   100   100   100   100   100   100   100    30     0     4   100   100   100   100   100   100   100   100   100   100    68     0     4   100
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]                81.4%    81.6%    99.7%    94%   0.6   3.8s    50    100   100    98   100   100   100   100   100   100    68    24    18    86   100   100    98   100   100   100   100   100    96    70     4    28    26
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:full]          81.2%    82.0%    98.9%    98%   0.6   3.8s    50    100   100   100   100   100    74   100   100    68    40     4    72    88   100   100   100   100   100    82   100   100    78    38    10    64    92
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:full]           81.0%    83.1%    97.5%    95%   0.7   3.0s    50    100    98    88   100   100    96   100    98    98    84    24    38    70   100   100   100   100   100    96    98   100    96    72     2    22    26
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:keep-last]        81.2%    84.6%    96.0%    97%   0.7   4.1s    50    100   100    94   100   100    94   100    92    98    72    12    42    78   100   100    86   100   100    98   100    98   100    64     0    52    32
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:full]             81.4%    84.8%    96.0%    95%   0.7   4.0s    50    100   100    88   100   100    94   100    96   100    72    12    58    80   100    98    88   100   100    96   100    92   100    64     2    36    40
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]                  80.9%    84.8%    95.4%    95%   0.7   3.9s    50    100   100    90   100   100    98   100    98   100    72     6    50    76   100   100    86   100   100   100   100    92    96    64     2    42    32
+gemma-4-E4B-it-Q4_K_M LS/N [reforged]                               79.7%    79.9%    99.8%   100%   0.3   8.1s    50    100   100   100   100   100    94    98   100    84     8    30    64    92   100   100   100   100   100    88    94   100    86     2     0    48    84
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]      79.8%    82.6%    96.7%    95%   0.6   2.8s    50    100   100    96   100   100    98    98    94    98    66     8    36    64   100   100    96   100   100   100    98    96    94    76     0    34    24
+qwen3:14b-q4_K_M OL/N [reforged:full]²                              78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:keep-last]                     78.7%    81.4%    96.7%    99%   0.5   9.3s    50    100   100   100   100   100    96    92    96    96     6    20    64    62   100   100   100   100   100    88    88   100    98     2     0    82    56
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]                79.5%    81.9%    97.0%    94%   0.7   2.8s    50    100   100    98   100   100    96   100    90    96    68     4    32    68   100    98    98   100   100    96    96    94    98    70     0    34    30
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:full]                          79.2%    82.2%    96.3%    99%   0.5  10.0s    50    100   100   100   100   100    96    88    94   100     2    40    78    54   100   100   100   100   100    94    84    98    96     0     0    82    52
+gemma-4-E4B-it-Q8_0 LS/N [reforged]                                 77.8%    77.9%    99.8%   100%   0.2  10.8s    50    100   100   100   100   100    76    90   100   100     0    18    38    98   100   100   100   100   100    76    98   100    98     0     0    36    94
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]                77.8%    77.8%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100     8     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]                 78.3%    78.3%   100.0%    95%   0.4   3.2s    50    100   100   100   100   100   100   100   100   100    18     0     0   100   100   100   100   100   100   100   100   100   100    16     2     0   100
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]               77.7%    78.8%    98.5%    95%   0.6   3.8s    50    100   100   100   100   100    74   100   100    78    32     2    46    90   100   100   100   100   100    66   100   100    74    28     4    48    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]     78.0%    78.8%    98.9%    95%   0.6   3.8s    50    100   100   100   100   100    82   100   100    70    28     2    52    78   100   100   100   100   100    74   100   100    80    18     4    48    92
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged:full]²    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
+gemma-4-E4B-it-Q8_0 LS/N [reforged:keep-last]                       75.7%    79.3%    95.5%    99%   0.5  13.1s    50    100   100   100   100   100    76    92    98    96     4    16    82    48   100   100   100   100   100    50    86    98    94     2     0    86    40
+gemma-4-E4B-it-Q8_0 LS/N [reforged:full]                            75.6%    80.8%    93.6%    98%   0.6  12.5s    50    100   100   100   100   100    92    94    92    88     0    18    82    28   100   100   100   100   100    76    92    94    98     2     0    84    26
+phi-4-Q4_K_M LS/P [reforged]                                        75.3%    75.4%    99.8%    83%   0.9   4.2s    50    100   100   100   100   100    26    62    94    96    62    34    66    70   100   100   100   100   100    28    84    98    94    42     0    60    42
+gemma4:e4b-it-q4_K_M OL/N [reforged:full]²                          74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
+ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged:full]²          74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]                 74.9%    83.3%    89.9%    79%   1.3   3.1s    50    100   100     0   100   100   100   100   100   100    20     2    42   100   100   100     0   100   100   100   100   100   100    22     0    62   100
+gemma-4-E4B-it-Q8_0 LS/P [reforged:full]                            73.7%    73.7%   100.0%    86%   0.6  12.7s    50    100   100   100   100   100    48   100    92    90     0    36    28    88   100   100   100   100   100    48    94    98    82     0     0    20    92
+gemma4:e4b-it-q8_0 OL/N [reforged:full]²                            73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
+gemma-4-E4B-it-Q8_0 LS/P [reforged:keep-last]                       74.1%    74.2%    99.8%    85%   0.6  13.4s    50    100   100   100   100   100    48   100    96    84     0    22    28    94   100   100   100   100   100    52   100    98    90     0     0    18    96
+Qwen3-8B-Q8_0 LS/P [reforged:keep-last]                             72.8%    73.0%    99.7%    89%   0.4  28.0s    50    100   100   100   100   100    98    98    80    90     0     6     8    94   100   100   100   100    96    96   100    60    98     0     2    12    54
+Qwen3-8B-Q8_0 LS/P [reforged:full]                                  72.8%    72.9%    99.8%    88%   0.4  28.9s    50    100   100   100   100   100    98   100    70    90     0     4    20    96   100   100   100   100    92   100    96    66    92     0     0    12    56
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:keep-last]                     73.2%    73.2%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    40    98    96    92     0    10    24    96   100   100   100   100   100    38   100   100    86     0     0    32    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:full]                          72.9%    72.9%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    54    98    98    86     0    10    28    88   100   100   100   100   100    26    98    96    94     0     0    30    90
+gemma-4-E4B-it-Q8_0 LS/P [reforged]                                 73.2%    73.3%    99.8%    85%   0.6  13.3s    50    100   100   100   100   100    54    98    94    80     0    28    20    94   100   100   100   100   100    40   100    98    90     0     0    12    96
+Qwen3-8B-Q4_K_M LS/P [reforged:keep-last]                           72.2%    72.3%    99.9%    88%   0.5  17.9s    50    100   100   100   100   100   100   100    62    66     0    30     8    90   100   100   100   100   100    98   100    74    68     0     0     8    74
+Qwen3-8B-Q8_0 LS/P [reforged]                                       72.0%    72.3%    99.6%    88%   0.4  28.6s    50    100   100   100   100   100    96   100    56    90     0     2    10    96   100   100   100   100    96    98   100    58    88     0     0    20    62
+Qwen3-14B-Q4_K_M LS/P [reforged:full]                               71.8%    71.9%    99.8%    87%   0.5  24.3s    50    100   100   100   100   100    96   100    72    72     2     0    30    74   100   100   100   100   100    92   100    74    68     0     0    38    48
+Qwen3-14B-Q4_K_M LS/P [reforged:keep-last]                          71.8%    71.8%   100.0%    86%   0.5  23.8s    50    100   100   100   100   100    98   100    72    58     2     4    28    72   100   100   100   100   100    94   100    76    74     0     0    32    56
+gemma-4-E4B-it-Q4_K_M LS/P [reforged]                               72.4%    72.4%    99.9%    85%   0.6   8.9s    50    100   100   100   100   100    56    96   100    86     0    14    28    84   100   100   100   100   100    34   100    96    74     0     0    22    92
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged:full]²    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
+Qwen3-8B-Q4_K_M LS/P [reforged:full]                                70.5%    70.8%    99.6%    88%   0.4  17.4s    50    100   100   100   100   100    88   100    58    66     0    24    10    88   100   100   100   100   100    94    98    62    66     0     0     4    76
+Qwen3-8B-Q4_K_M LS/P [reforged]                                     71.1%    71.2%    99.8%    87%   0.5  18.0s    50    100   100   100   100   100    96   100    70    66     0    18     8    96   100   100   100   100    98    96   100    60    60     0     0    10    70
+Qwen3-14B-Q4_K_M LS/P [reforged]                                    71.4%    71.4%    99.9%    86%   0.5  25.6s    50    100   100   100   100   100    98   100    70    70     0     0    30    80   100   100   100   100   100    92   100    76    56     0     0    22    62
+ministral-3:8b-instruct-2512-q8_0 OL/N [reforged:full]²             70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged:full]²                71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged:full]²                70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
+Qwen3-14B-Q4_K_M LS/N [reforged]                                    68.5%    68.5%   100.0%   100%   0.4  21.8s    50    100   100   100   100    98   100   100    56    72    14     4    58     4   100   100   100   100   100   100   100    42    72    10     0    52     0
+Qwen3-8B-Q8_0 LS/N [reforged:full]                                  69.3%    69.6%    99.6%    88%   0.6  24.7s    50     98   100   100   100   100    94   100    48    76     2    28    24    46   100   100   100   100   100    98   100    46    76     2     8    16    40
+granite4.1:8b-q8_0 OL/N [reforged:full]²                            69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
+qwen3:8b-q8_0 OL/N [reforged:full]²                                 67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
+Qwen3-14B-Q4_K_M LS/N [reforged:full]                               68.4%    68.4%    99.9%    83%   0.9  21.9s    50    100   100   100   100    98    90    98    60    32    20    18    50    38   100   100   100   100   100    86   100    74    34     6    18    34    22
+Qwen3-8B-Q8_0 LS/N [reforged]                                       68.2%    68.5%    99.5%    95%   0.3  24.8s    50    100   100   100   100   100   100   100    56    78     6    24    28     6    98   100   100   100   100   100   100    52    84     6     2    30     2
+Qwen3-8B-Q4_K_M LS/N [reforged]                                     67.3%    67.5%    99.7%    96%   0.3  15.6s    50    100   100   100   100   100   100   100    40    98     6    14    22     2   100   100   100   100   100   100   100    48    86     2     0    26     6
+Qwen3-8B-Q8_0 LS/N [reforged:keep-last]                             67.0%    67.2%    99.8%    92%   0.4  23.2s    50    100   100   100   100   100    94    98    48    84     2    22    10    12   100   100   100   100   100    96   100    52    80     0    18    20     6
+ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged:full]²           66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
+Qwen3-8B-Q4_K_M LS/N [reforged:full]                                65.8%    66.0%    99.7%    84%   0.7  17.2s    50    100   100   100   100   100    94   100    34    66     0    18    10    38   100   100   100    96   100    86   100    34    74     0    10    12    40
+Qwen3-8B-Q4_K_M LS/N [reforged:keep-last]                           64.5%    64.6%    99.9%    91%   0.4  15.0s    50    100   100   100   100   100   100   100    30    82     0    28    12    10   100   100   100    98   100    96   100    22    86     2     2     6     4
+granite-4.1-8b-Q8_0 LS/N [reforged]                                 65.4%    65.4%   100.0%    88%   1.3   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+qwen3:8b-q4_K_M OL/N [reforged:full]²                               64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
+granite-4.1-8b-Q4_K_M LS/N [reforged:keep-last]                     65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:full]                          65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged]                               65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+Qwen3-14B-Q4_K_M LS/N [reforged:keep-last]                          64.0%    64.0%    99.9%    91%   0.6  20.3s    50    100   100   100   100   100    90    98    48    40    18     6    38     4   100   100   100    98    96    88   100    54    30    16     2    38     0
+granite-4.1-8b-Q4_K_M LS/P [reforged]                               61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
+granite-4.1-8b-Q8_0 LS/P [reforged]                                 61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
+granite4.1:8b-q4_K_M OL/N [reforged:full]²                          57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 Scr=score(correct/total), Acc=accuracy(correct/total, excl validate errors), Cmp=completeness(completed/total), Eff=efficiency(ideal/actual calls), Wst=avg wasted calls, Spd=avg time(excl compaction)
 rel=relevance_detection, arg=argument_fidelity, tsl=tool_selection, b2s=basic_2step, s3s=sequential_3step, crt=conditional_routing, srn=sequential_reasoning, err=error_recovery, dgr=data_gap_recovery, dge=data_gap_recovery_extended, art=argument_transformation, grs=grounded_synthesis, iar=inconsistent_api_recovery, rel_s=relevance_detection_stateful, arg_s=argument_fidelity_stateful, tsl_s=tool_selection_stateful, b2s_s=basic_2step_stateful, s3s_s=sequential_3step_stateful, crt_s=conditional_routing_stateful, srn_s=sequential_reasoning_stateful, err_s=error_recovery_stateful, dgr_s=data_gap_recovery_stateful, dge_s=data_gap_recovery_extended_stateful, art_s=argument_transformation_stateful, grs_s=grounded_synthesis_stateful, iar_s=inconsistent_api_recovery_stateful
 Ablation: full=all guardrails, no_rescue=no rescue loop, no_nudge=no rescue/retry nudge, no_steps=no step enforcement, no_recovery=no error recovery, no_compact=no compaction, bare=all guardrails off
+Replay: ':keep-last'/':full' tags = reasoning_replay policy (how much captured reasoning is re-sent to the backend each turn); untagged = none (default). Rows predating the knob ran unbounded replay and count as full.
 
 Eval generations (older runs carried forward, superscript-tagged):
   ¹ gen 1 — v0.6.0 suite — incl. Anthropic ablation (commit 2b05dc4, 2026-05-08)
+  ² gen 2 — v0.7.0 lineup refresh (8–14B) + 32GB tier debut (v0.7.4) (commit 655e1f6, 2026-05-22)
 
-*Generated 2026-06-03 00:09*
+*Generated 2026-06-11 20:28*
diff --git a/docs/results/raw/reforged/by-backend.md b/docs/results/raw/reforged/by-backend.md
index 59ee6e2..6314fd5 100644
--- a/docs/results/raw/reforged/by-backend.md
+++ b/docs/results/raw/reforged/by-backend.md
@@ -3,128 +3,152 @@
 ## ministral-3-8b-instruct-q8_0
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                          Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]    84.4%    91.1%    92.6%    92%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100   100     0    98   100
-Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]    81.4%    81.4%   100.0%   100%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    38     4     0   100   100   100   100   100   100   100   100   100   100    74     0     0   100
-ministral-3:8b-instruct-2512-q8_0 OL/N [reforged]    70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]          84.2%    90.9%    92.7%    91%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100    96     0    98   100
+Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]          81.0%    81.0%   100.0%   100%   0.3   4.1s    50    100   100   100   100   100   100   100   100   100    30     0     4   100   100   100   100   100   100   100   100   100   100    68     0     4   100
+ministral-3:8b-instruct-2512-q8_0 OL/N [reforged:full]²    70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## ministral-3-14b-instruct-q4_K_M
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]    80.2%    80.2%   100.0%   100%   0.0   2.9s    50    100   100   100   100   100   100   100   100   100     0     0    36   100   100   100   100   100   100   100   100   100   100     0     0    50   100
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]    78.1%    78.1%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    16     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
-ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged]    74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]          80.6%    80.6%   100.0%   100%   0.0   3.0s    50    100   100   100   100   100   100   100   100   100     0     0    46   100   100   100   100   100   100   100   100   100    98     0     0    52   100
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]          77.8%    77.8%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100     8     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
+ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged:full]²    74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## qwen3-14b-q4_K_M
+## gemma4-e4b-q4_K_M
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-qwen3:14b-q4_K_M OL/N [reforged]    78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
-Qwen3-14B-Q4_K_M LS/P [reforged]    70.5%    70.8%    99.7%    86%   0.5  24.2s    50    100   100   100   100   100    94   100    64    68     0     0    32    72   100   100   100   100    98    94   100    58    66     0     0    30    58
-Qwen3-14B-Q4_K_M LS/N [reforged]    67.7%    67.7%    99.9%    85%   0.9  20.8s    50    100   100   100   100   100    94   100    62    36     4    22    44    22   100   100   100   100    98    84   100    66    24    12    18    42    32
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/N [reforged]              79.7%    79.9%    99.8%   100%   0.3   8.1s    50    100   100   100   100   100    94    98   100    84     8    30    64    92   100   100   100   100   100    88    94   100    86     2     0    48    84
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:full]         79.2%    82.2%    96.3%    99%   0.5  10.0s    50    100   100   100   100   100    96    88    94   100     2    40    78    54   100   100   100   100   100    94    84    98    96     0     0    82    52
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:keep-last]    78.7%    81.4%    96.7%    99%   0.5   9.3s    50    100   100   100   100   100    96    92    96    96     6    20    64    62   100   100   100   100   100    88    88   100    98     2     0    82    56
+gemma4:e4b-it-q4_K_M OL/N [reforged:full]²         74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:keep-last]    73.2%    73.2%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    40    98    96    92     0    10    24    96   100   100   100   100   100    38   100   100    86     0     0    32    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:full]         72.9%    72.9%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    54    98    98    86     0    10    28    88   100   100   100   100   100    26    98    96    94     0     0    30    90
+gemma-4-E4B-it-Q4_K_M LS/P [reforged]              72.4%    72.4%    99.9%    85%   0.6   8.9s    50    100   100   100   100   100    56    96   100    86     0    14    28    84   100   100   100   100   100    34   100    96    74     0     0    22    92
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## ministral-3-8b-instruct-q4_K_M
+## qwen3-14b-q4_K_M
 
 ```
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]    78.3%    78.4%    99.8%    95%   0.4   3.2s    50    100   100   100   100   100   100   100    98   100    22     0     0   100   100   100   100   100   100   100   100    98   100    14     2     2   100
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]    75.6%    83.8%    90.2%    79%   1.3   3.0s    50     98   100     0   100   100   100   100   100   100    22    12    56   100   100   100     0   100   100   100   100   100   100    28     0    50   100
-ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged]    66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+qwen3:14b-q4_K_M OL/N [reforged:full]²        78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
+Qwen3-14B-Q4_K_M LS/P [reforged:keep-last]    71.8%    71.8%   100.0%    86%   0.5  23.8s    50    100   100   100   100   100    98   100    72    58     2     4    28    72   100   100   100   100   100    94   100    76    74     0     0    32    56
+Qwen3-14B-Q4_K_M LS/P [reforged:full]         71.8%    71.9%    99.8%    87%   0.5  24.3s    50    100   100   100   100   100    96   100    72    72     2     0    30    74   100   100   100   100   100    92   100    74    68     0     0    38    48
+Qwen3-14B-Q4_K_M LS/P [reforged]              71.4%    71.4%    99.9%    86%   0.5  25.6s    50    100   100   100   100   100    98   100    70    70     0     0    30    80   100   100   100   100   100    92   100    76    56     0     0    22    62
+Qwen3-14B-Q4_K_M LS/N [reforged]              68.5%    68.5%   100.0%   100%   0.4  21.8s    50    100   100   100   100    98   100   100    56    72    14     4    58     4   100   100   100   100   100   100   100    42    72    10     0    52     0
+Qwen3-14B-Q4_K_M LS/N [reforged:full]         68.4%    68.4%    99.9%    83%   0.9  21.9s    50    100   100   100   100    98    90    98    60    32    20    18    50    38   100   100   100   100   100    86   100    74    34     6    18    34    22
+Qwen3-14B-Q4_K_M LS/N [reforged:keep-last]    64.0%    64.0%    99.9%    91%   0.6  20.3s    50    100   100   100   100   100    90    98    48    40    18     6    38     4   100   100   100    98    96    88   100    54    30    16     2    38     0
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## gemma4-e4b-q4_K_M
+## ministral-3-8b-instruct-q4_K_M
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q4_K_M LS/N [reforged]    78.2%    82.2%    95.1%    98%   0.5   9.0s    50    100   100   100   100   100    92    98    98    90     0    24    80    50   100   100   100   100   100    94    90    94    98     0     0    84    40
-gemma4:e4b-it-q4_K_M OL/N [reforged]     74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
-gemma-4-E4B-it-Q4_K_M LS/P [reforged]    72.8%    72.8%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    38    98    94    94     0    18    26    96   100   100   100   100   100    26    98   100    92     0     2    22    90
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]          78.3%    78.3%   100.0%    95%   0.4   3.2s    50    100   100   100   100   100   100   100   100   100    18     0     0   100   100   100   100   100   100   100   100   100   100    16     2     0   100
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]          74.9%    83.3%    89.9%    79%   1.3   3.1s    50    100   100     0   100   100   100   100   100   100    20     2    42   100   100   100     0   100   100   100   100   100   100    22     0    62   100
+ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged:full]²    66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma4-e4b-q8_0
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q8_0 LS/N [reforged]    76.2%    80.7%    94.5%    98%   0.6  12.8s    50    100   100   100   100   100    84    88    90    96     2    14    80    44   100   100   100   100   100    88    90    96    94     4     0    80    32
-gemma-4-E4B-it-Q8_0 LS/P [reforged]    74.7%    74.7%   100.0%    85%   0.6  12.7s    50    100   100   100   100   100    70   100    90    88     0    16    34    94   100   100   100   100   100    48   100    98    84     0     0    30    90
-gemma4:e4b-it-q8_0 OL/N [reforged]     73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q8_0 LS/N [reforged]              77.8%    77.9%    99.8%   100%   0.2  10.8s    50    100   100   100   100   100    76    90   100   100     0    18    38    98   100   100   100   100   100    76    98   100    98     0     0    36    94
+gemma-4-E4B-it-Q8_0 LS/N [reforged:keep-last]    75.7%    79.3%    95.5%    99%   0.5  13.1s    50    100   100   100   100   100    76    92    98    96     4    16    82    48   100   100   100   100   100    50    86    98    94     2     0    86    40
+gemma-4-E4B-it-Q8_0 LS/N [reforged:full]         75.6%    80.8%    93.6%    98%   0.6  12.5s    50    100   100   100   100   100    92    94    92    88     0    18    82    28   100   100   100   100   100    76    92    94    98     2     0    84    26
+gemma-4-E4B-it-Q8_0 LS/P [reforged:keep-last]    74.1%    74.2%    99.8%    85%   0.6  13.4s    50    100   100   100   100   100    48   100    96    84     0    22    28    94   100   100   100   100   100    52   100    98    90     0     0    18    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:full]         73.7%    73.7%   100.0%    86%   0.6  12.7s    50    100   100   100   100   100    48   100    92    90     0    36    28    88   100   100   100   100   100    48    94    98    82     0     0    20    92
+gemma4:e4b-it-q8_0 OL/N [reforged:full]²         73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
+gemma-4-E4B-it-Q8_0 LS/P [reforged]              73.2%    73.3%    99.8%    85%   0.6  13.3s    50    100   100   100   100   100    54    98    94    80     0    28    20    94   100   100   100   100   100    40   100    98    90     0     0    12    96
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## qwen3-8b-q8_0
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                      Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q8_0 LS/P [reforged]    73.1%    73.2%    99.8%    89%   0.4  28.4s    50    100   100   100   100   100   100   100    58    96     0     8    28    94   100   100   100   100    96   100    98    64    88     0     0    12    58
-Qwen3-8B-Q8_0 LS/N [reforged]    70.3%    70.5%    99.7%    88%   0.6  24.1s    50    100   100   100   100   100   100   100    60    82     4    22    20    32   100   100   100   100    98    94   100    58    66     2    12    28    50
-qwen3:8b-q8_0 OL/N [reforged]    67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/P [reforged:keep-last]    72.8%    73.0%    99.7%    89%   0.4  28.0s    50    100   100   100   100   100    98    98    80    90     0     6     8    94   100   100   100   100    96    96   100    60    98     0     2    12    54
+Qwen3-8B-Q8_0 LS/P [reforged:full]         72.8%    72.9%    99.8%    88%   0.4  28.9s    50    100   100   100   100   100    98   100    70    90     0     4    20    96   100   100   100   100    92   100    96    66    92     0     0    12    56
+Qwen3-8B-Q8_0 LS/P [reforged]              72.0%    72.3%    99.6%    88%   0.4  28.6s    50    100   100   100   100   100    96   100    56    90     0     2    10    96   100   100   100   100    96    98   100    58    88     0     0    20    62
+Qwen3-8B-Q8_0 LS/N [reforged:full]         69.3%    69.6%    99.6%    88%   0.6  24.7s    50     98   100   100   100   100    94   100    48    76     2    28    24    46   100   100   100   100   100    98   100    46    76     2     8    16    40
+Qwen3-8B-Q8_0 LS/N [reforged]              68.2%    68.5%    99.5%    95%   0.3  24.8s    50    100   100   100   100   100   100   100    56    78     6    24    28     6    98   100   100   100   100   100   100    52    84     6     2    30     2
+qwen3:8b-q8_0 OL/N [reforged:full]²        67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
+Qwen3-8B-Q8_0 LS/N [reforged:keep-last]    67.0%    67.2%    99.8%    92%   0.4  23.2s    50    100   100   100   100   100    94    98    48    84     2    22    10    12   100   100   100   100   100    96   100    52    80     0    18    20     6
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## qwen3-8b-q4_K_M
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q4_K_M LS/P [reforged]    70.4%    70.7%    99.6%    86%   0.5  17.8s    50    100   100   100   100   100    94   100    56    64     0    14    12    92   100   100   100   100   100    94   100    62    58     0     0     6    78
-Qwen3-8B-Q4_K_M LS/N [reforged]    68.2%    68.4%    99.6%    86%   0.7  16.1s    50     98   100   100   100   100    92   100    48    78     0    44     8    38   100   100   100   100   100    90   100    40    76     0     8    14    38
-qwen3:8b-q4_K_M OL/N [reforged]    64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q4_K_M LS/P [reforged:keep-last]    72.2%    72.3%    99.9%    88%   0.5  17.9s    50    100   100   100   100   100   100   100    62    66     0    30     8    90   100   100   100   100   100    98   100    74    68     0     0     8    74
+Qwen3-8B-Q4_K_M LS/P [reforged]              71.1%    71.2%    99.8%    87%   0.5  18.0s    50    100   100   100   100   100    96   100    70    66     0    18     8    96   100   100   100   100    98    96   100    60    60     0     0    10    70
+Qwen3-8B-Q4_K_M LS/P [reforged:full]         70.5%    70.8%    99.6%    88%   0.4  17.4s    50    100   100   100   100   100    88   100    58    66     0    24    10    88   100   100   100   100   100    94    98    62    66     0     0     4    76
+Qwen3-8B-Q4_K_M LS/N [reforged]              67.3%    67.5%    99.7%    96%   0.3  15.6s    50    100   100   100   100   100   100   100    40    98     6    14    22     2   100   100   100   100   100   100   100    48    86     2     0    26     6
+Qwen3-8B-Q4_K_M LS/N [reforged:full]         65.8%    66.0%    99.7%    84%   0.7  17.2s    50    100   100   100   100   100    94   100    34    66     0    18    10    38   100   100   100    96   100    86   100    34    74     0    10    12    40
+qwen3:8b-q4_K_M OL/N [reforged:full]²        64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
+Qwen3-8B-Q4_K_M LS/N [reforged:keep-last]    64.5%    64.6%    99.9%    91%   0.4  15.0s    50    100   100   100   100   100   100   100    30    82     0    28    12    10   100   100   100    98   100    96   100    22    86     2     2     6     4
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite-4.1-8b-q8_0
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                            Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite4.1:8b-q8_0 OL/N [reforged]     69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
-granite-4.1-8b-Q8_0 LS/N [reforged]    65.4%    65.4%   100.0%    88%   1.4   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-granite-4.1-8b-Q8_0 LS/P [reforged]    61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite4.1:8b-q8_0 OL/N [reforged:full]²    69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
+granite-4.1-8b-Q8_0 LS/N [reforged]         65.4%    65.4%   100.0%    88%   1.3   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q8_0 LS/P [reforged]         61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite-4.1-8b-q4_K_M
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite-4.1-8b-Q4_K_M LS/N [reforged]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-granite-4.1-8b-Q4_K_M LS/P [reforged]    61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
-granite4.1:8b-q4_K_M OL/N [reforged]     57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite-4.1-8b-Q4_K_M LS/N [reforged]              65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:keep-last]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:full]         65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/P [reforged]              61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
+granite4.1:8b-q4_K_M OL/N [reforged:full]²         57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 Scr=score(correct/total), Acc=accuracy(correct/total, excl validate errors), Cmp=completeness(completed/total), Eff=efficiency(ideal/actual calls), Wst=avg wasted calls, Spd=avg time(excl compaction)
 rel=relevance_detection, arg=argument_fidelity, tsl=tool_selection, b2s=basic_2step, s3s=sequential_3step, crt=conditional_routing, srn=sequential_reasoning, err=error_recovery, dgr=data_gap_recovery, dge=data_gap_recovery_extended, art=argument_transformation, grs=grounded_synthesis, iar=inconsistent_api_recovery, rel_s=relevance_detection_stateful, arg_s=argument_fidelity_stateful, tsl_s=tool_selection_stateful, b2s_s=basic_2step_stateful, s3s_s=sequential_3step_stateful, crt_s=conditional_routing_stateful, srn_s=sequential_reasoning_stateful, err_s=error_recovery_stateful, dgr_s=data_gap_recovery_stateful, dge_s=data_gap_recovery_extended_stateful, art_s=argument_transformation_stateful, grs_s=grounded_synthesis_stateful, iar_s=inconsistent_api_recovery_stateful
 Ablation: full=all guardrails, no_rescue=no rescue loop, no_nudge=no rescue/retry nudge, no_steps=no step enforcement, no_recovery=no error recovery, no_compact=no compaction, bare=all guardrails off
+Replay: ':keep-last'/':full' tags = reasoning_replay policy (how much captured reasoning is re-sent to the backend each turn); untagged = none (default). Rows predating the knob ran unbounded replay and count as full.
 
 Eval generations (older runs carried forward, superscript-tagged):
   ¹ gen 1 — v0.6.0 suite — incl. Anthropic ablation (commit 2b05dc4, 2026-05-08)
+  ² gen 2 — v0.7.0 lineup refresh (8–14B) + 32GB tier debut (v0.7.4) (commit 655e1f6, 2026-05-22)
 
-*Generated 2026-06-03 00:09*
+*Generated 2026-06-11 20:28*
diff --git a/docs/results/raw/reforged/by-family.md b/docs/results/raw/reforged/by-family.md
index 315d025..a06cc09 100644
--- a/docs/results/raw/reforged/by-family.md
+++ b/docs/results/raw/reforged/by-family.md
@@ -2,144 +2,154 @@
 
 ## claude
 
-```
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-claude-opus-4-6 AN/N [reforged]¹              99.2%    99.8%    99.4%   100%   0.0  15.6s    50    100   100   100   100   100    98   100   100   100   100    98    94    98   100   100   100   100   100   100   100    96   100   100    98   100    98
-claude-sonnet-4-6 AN/N [reforged]¹            98.4%    98.5%    99.9%   100%   0.1  13.1s    50    100   100   100   100   100   100   100   100   100    98    74    98   100   100   100   100   100   100   100   100   100   100   100    88   100   100
-claude-haiku-4-5-20251001 AN/N [reforged]¹    94.5%    94.9%    99.6%   100%   0.3   8.5s    50    100   100   100   100   100   100   100   100   100    80    80    98   100   100   100   100   100   100   100    94   100   100    76    36    94   100
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-```
-
-## qwen3.6-35b-a3b
-
 ```
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged]    94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
-Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged]    82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
+claude-sonnet-4-6 AN/N [reforged]           100.0%   100.0%   100.0%   100%   0.0  18.2s    50    100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100
+claude-opus-4-8 AN/N [reforged]             100.0%   100.0%   100.0%   100%   0.0  13.3s    50    100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100   100
+claude-opus-4-6 AN/N [reforged:full]¹        99.2%    99.8%    99.4%   100%   0.0  15.6s    50    100   100   100   100   100    98   100   100   100   100    98    94    98   100   100   100   100   100   100   100    96   100   100    98   100    98
+claude-haiku-4-5-20251001 AN/N [reforged]    94.2%    94.2%    99.9%   100%   0.3   6.6s    50    100   100   100   100   100   100    98   100   100    74    74    98   100   100   100   100   100   100   100   100   100   100    72    38    94   100
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## qwen3.5-27b
+## qwen3.6-35b-a3b
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-27B-Q4_K_M LS/N [reforged]    93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
-Qwen3.5-27B-Q4_K_M LS/P [reforged]    86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/N [reforged:full]²    94.8%    95.1%    99.7%   100%   0.6  12.7s    50    100   100   100   100   100   100    96   100   100    72    78    92   100    98   100   100   100   100    98    92   100   100    68    76    94   100
+Qwen3.6-35B-A3B-UD-Q4_K_M LS/P [reforged:full]²    82.2%    82.2%   100.0%   100%   0.3  23.6s    50     96   100   100   100   100    90    92    98    92    16    46    62    98    88   100   100   100   100    88    94    96    88     8    42    50    94
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## qwen3.6-27b
+## qwen3.5-27b
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                           Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.6-27B-Q4_K_M LS/N [reforged]    92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
-Qwen3.6-27B-Q4_K_M LS/P [reforged]    83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-27B-Q4_K_M LS/N [reforged:full]²    93.2%    93.3%    99.8%    82%   1.4  37.6s    50    100   100   100   100   100   100   100    98   100    74    38    88    98   100   100   100   100   100    98   100   100   100    78    56    96    98
+Qwen3.5-27B-Q4_K_M LS/P [reforged:full]²    86.8%    86.8%   100.0%   100%   0.1  24.4s    50    100   100   100   100   100   100   100   100   100    42    10    78   100   100   100   100   100   100   100   100   100   100    36    10    80   100
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## qwen3.5-35b-a3b
+## qwen3.6-27b
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                               Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged]    92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
-Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged]    82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                 Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.6-27B-Q4_K_M LS/N [reforged:full]²    92.2%    92.5%    99.6%   100%   0.4  37.9s    50    100   100   100   100   100   100   100    98   100    22    74    98   100   100   100   100   100   100   100   100    96    98    36    78    96   100
+Qwen3.6-27B-Q4_K_M LS/P [reforged:full]²    83.5%    85.0%    98.2%    97%   0.4  53.9s    50    100   100   100   100   100   100   100   100    98     6    66    52    90   100   100   100   100   100    98   100    96    90     2    56    36    80
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## ministral-14b
+## qwen3.5-35b-a3b
 
 ```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]    84.5%    84.5%   100.0%    97%   0.6   5.4s    50    100   100   100   100   100    88   100   100    70    44    48    76    94   100   100   100   100   100    96    98   100    76    38    26    62    82
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]     80.2%    80.2%   100.0%   100%   0.0   2.9s    50    100   100   100   100   100   100   100   100   100     0     0    36   100   100   100   100   100   100   100   100   100   100     0     0    50   100
-Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]    79.5%    80.5%    98.7%    96%   0.5   3.7s    50    100   100   100   100   100    82   100   100    78    30     6    58    92   100   100   100   100   100    74   100   100    80    20     6    56    84
-Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]     78.1%    78.1%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    16     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
-ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged]     74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3.5-35B-A3B-Q4_K_M LS/N [reforged:full]²    92.1%    92.4%    99.7%    82%   1.3  11.1s    50    100   100   100   100   100    96    98   100   100    96    14    84   100   100   100   100   100   100    96   100   100   100    94    20    96   100
+Qwen3.5-35B-A3B-Q4_K_M LS/P [reforged:full]²    82.8%    82.8%   100.0%   100%   0.2  10.4s    50     48   100   100   100   100    94    98   100   100    74    16    62    90    56   100   100   100   100    96   100   100    98    68    14    58    82
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## ministral-8b
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]       84.4%    91.1%    92.6%    92%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100   100     0    98   100
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]      84.2%    84.2%   100.0%    95%   0.5   6.0s    50    100   100   100   100   100   100   100   100    98    74    26    54    88   100   100   100   100   100   100   100   100   100    68     2    26    52
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]    82.8%    82.8%    99.9%    95%   0.5   4.1s    50    100   100   100   100   100   100   100   100    98    66    24    34    92   100   100   100   100   100    96   100   100   100    70     0    30    42
-Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]       81.4%    81.4%   100.0%   100%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100    38     4     0   100   100   100   100   100   100   100   100   100   100    74     0     0   100
-Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]      81.3%    85.0%    95.7%    96%   0.7   3.9s    50    100   100    96   100   100    98   100    92    98    56    14    46    70   100   100    88   100   100    98   100    94   100    82     0    48    34
-Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]    80.5%    83.0%    97.0%    96%   0.7   2.7s    50    100    98    98   100   100    98    98   100    96    74    10    36    70   100    98    96   100   100    98   100    98    94    70     2    38    20
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]     78.3%    78.4%    99.8%    95%   0.4   3.2s    50    100   100   100   100   100   100   100    98   100    22     0     0   100   100   100   100   100   100   100   100    98   100    14     2     2   100
-Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]     75.6%    83.8%    90.2%    79%   1.3   3.0s    50     98   100     0   100   100   100   100   100   100    22    12    56   100   100   100     0   100   100   100   100   100   100    28     0    50   100
-ministral-3:8b-instruct-2512-q8_0 OL/N [reforged]       70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
-ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged]     66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:keep-last]      84.8%    85.6%    99.1%    94%   0.6   5.9s    50    100   100   100   100   100   100   100   100    98    70    24    42    86   100   100   100   100   100   100   100   100   100    82     6    36    62
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged]                84.5%    84.8%    99.7%    96%   0.6   5.3s    50    100   100    96   100   100    98   100   100   100    76    18    44    92   100   100   100   100   100   100    98   100    98    80     2    42    54
+Ministral-3-8B-Instruct-2512-Q8_0 LS/P [reforged]                 84.2%    90.9%    92.7%    91%   0.7   4.6s    50    100   100     6   100   100   100   100   100   100    98     8   100    80   100   100     4   100   100   100   100   100   100    96     0    98   100
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/N [reforged:full]           83.1%    83.1%    99.9%    96%   0.5   6.0s    50    100   100   100   100   100   100    98   100   100    66    20    36    88   100   100   100   100   100   100   100   100    98    74     2    26    52
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.4%    83.0%    99.3%    92%   0.6   4.2s    50    100   100   100   100   100    98   100   100   100    68    16    28    86   100   100   100   100   100    96    98   100    96    64     6    24    62
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         81.8%    81.8%   100.0%    95%   0.5   4.3s    50    100   100    98   100    98    98    98   100    98    62     8    30    96   100   100    98   100   100    98    96   100    98    62     2    40    46
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/N [reforged]              81.4%    81.6%    99.7%    94%   0.6   3.8s    50    100   100    98   100   100   100   100   100   100    68    24    18    86   100   100    98   100   100   100   100   100    96    70     4    28    26
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:full]           81.4%    84.8%    96.0%    95%   0.7   4.0s    50    100   100    88   100   100    94   100    96   100    72    12    58    80   100    98    88   100   100    96   100    92   100    64     2    36    40
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged:keep-last]      81.2%    84.6%    96.0%    97%   0.7   4.1s    50    100   100    94   100   100    94   100    92    98    72    12    42    78   100   100    86   100   100    98   100    98   100    64     0    52    32
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.0%    83.1%    97.5%    95%   0.7   3.0s    50    100    98    88   100   100    96   100    98    98    84    24    38    70   100   100   100   100   100    96    98   100    96    72     2    22    26
+Ministral-3-8B-Instruct-2512-Q8_0 LS/N [reforged]                 81.0%    81.0%   100.0%   100%   0.3   4.1s    50    100   100   100   100   100   100   100   100   100    30     0     4   100   100   100   100   100   100   100   100   100   100    68     0     4   100
+Ministral-3-8B-Reasoning-2512-Q8_0 LS/P [reforged]                80.9%    84.8%    95.4%    95%   0.7   3.9s    50    100   100    90   100   100    98   100    98   100    72     6    50    76   100   100    86   100   100   100   100    92    96    64     2    42    32
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    79.8%    82.6%    96.7%    95%   0.6   2.8s    50    100   100    96   100   100    98    98    94    98    66     8    36    64   100   100    96   100   100   100    98    96    94    76     0    34    24
+Ministral-3-8B-Reasoning-2512-Q4_K_M LS/P [reforged]              79.5%    81.9%    97.0%    94%   0.7   2.8s    50    100   100    98   100   100    96   100    90    96    68     4    32    68   100    98    98   100   100    96    96    94    98    70     0    34    30
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/N [reforged]               78.3%    78.3%   100.0%    95%   0.4   3.2s    50    100   100   100   100   100   100   100   100   100    18     0     0   100   100   100   100   100   100   100   100   100   100    16     2     0   100
+Ministral-3-8B-Instruct-2512-Q4_K_M LS/P [reforged]               74.9%    83.3%    89.9%    79%   1.3   3.1s    50    100   100     0   100   100   100   100   100   100    20     2    42   100   100   100     0   100   100   100   100   100   100    22     0    62   100
+ministral-3:8b-instruct-2512-q8_0 OL/N [reforged:full]²           70.7%    74.5%    94.9%    74%   1.1   5.9s    50    100   100   100   100   100   100   100   100    92    12    42     0    42   100   100   100   100   100   100   100   100    26     4     8     0    12
+ministral-3:8b-instruct-2512-q4_K_M OL/N [reforged:full]²         66.8%    71.9%    92.9%    68%   1.4   5.4s    50    100   100   100   100    76   100   100    68    90     0     0     4    64   100   100   100   100    28   100   100    80    98     0     0     6    22
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## qwen3-14b
+## ministral-14b
 
 ```
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-qwen3:14b-q4_K_M OL/N [reforged]    78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
-Qwen3-14B-Q4_K_M LS/P [reforged]    70.5%    70.8%    99.7%    86%   0.5  24.2s    50    100   100   100   100   100    94   100    64    68     0     0    32    72   100   100   100   100    98    94   100    58    66     0     0    30    58
-Qwen3-14B-Q4_K_M LS/N [reforged]    67.7%    67.7%    99.9%    85%   0.9  20.8s    50    100   100   100   100   100    94   100    62    36     4    22    44    22   100   100   100   100    98    84   100    66    24    12    18    42    32
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged]              83.3%    83.3%   100.0%    96%   0.6   4.8s    50    100   100   100   100   100   100   100   100    60    32    34    78    94   100   100   100   100   100    92   100   100    62    30    28    78    78
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:full]         83.2%    83.2%   100.0%    97%   0.6   5.6s    50    100   100   100   100   100    86    98   100    68    40    32    76    96   100   100   100   100   100    92   100   100    62    34    30    68    82
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/N [reforged:keep-last]    82.9%    82.9%   100.0%    95%   0.6   5.0s    50    100   100   100   100   100    92   100   100    68    40    30    60    94   100   100   100   100   100    96   100   100    62    36    20    72    86
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:full]         81.2%    82.0%    98.9%    98%   0.6   3.8s    50    100   100   100   100   100    74   100   100    68    40     4    72    88   100   100   100   100   100    82   100   100    78    38    10    64    92
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/P [reforged]               80.6%    80.6%   100.0%   100%   0.0   3.0s    50    100   100   100   100   100   100   100   100   100     0     0    46   100   100   100   100   100   100   100   100   100    98     0     0    52   100
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged:keep-last]    78.0%    78.8%    98.9%    95%   0.6   3.8s    50    100   100   100   100   100    82   100   100    70    28     2    52    78   100   100   100   100   100    74   100   100    80    18     4    48    92
+Ministral-3-14B-Instruct-2512-Q4_K_M LS/N [reforged]               77.8%    77.8%   100.0%    97%   0.3   4.0s    50    100   100   100   100   100   100   100   100   100     8     0     0   100   100   100   100   100   100   100   100   100   100    14     0     0   100
+Ministral-3-14B-Reasoning-2512-Q4_K_M LS/P [reforged]              77.7%    78.8%    98.5%    95%   0.6   3.8s    50    100   100   100   100   100    74   100   100    78    32     2    46    90   100   100   100   100   100    66   100   100    74    28     4    48    78
+ministral-3:14b-instruct-2512-q4_K_M OL/N [reforged:full]²         74.8%    74.8%   100.0%    81%   1.0   6.2s    50    100   100   100   100   100   100   100   100    96    56     0     0    80   100   100   100   100   100   100   100   100    98     0     0     0    16
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## gemma4-e4b
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-gemma-4-E4B-it-Q4_K_M LS/N [reforged]    78.2%    82.2%    95.1%    98%   0.5   9.0s    50    100   100   100   100   100    92    98    98    90     0    24    80    50   100   100   100   100   100    94    90    94    98     0     0    84    40
-gemma-4-E4B-it-Q8_0 LS/N [reforged]      76.2%    80.7%    94.5%    98%   0.6  12.8s    50    100   100   100   100   100    84    88    90    96     2    14    80    44   100   100   100   100   100    88    90    96    94     4     0    80    32
-gemma4:e4b-it-q4_K_M OL/N [reforged]     74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
-gemma-4-E4B-it-Q8_0 LS/P [reforged]      74.7%    74.7%   100.0%    85%   0.6  12.7s    50    100   100   100   100   100    70   100    90    88     0    16    34    94   100   100   100   100   100    48   100    98    84     0     0    30    90
-gemma4:e4b-it-q8_0 OL/N [reforged]       73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
-gemma-4-E4B-it-Q4_K_M LS/P [reforged]    72.8%    72.8%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    38    98    94    94     0    18    26    96   100   100   100   100   100    26    98   100    92     0     2    22    90
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+gemma-4-E4B-it-Q4_K_M LS/N [reforged]              79.7%    79.9%    99.8%   100%   0.3   8.1s    50    100   100   100   100   100    94    98   100    84     8    30    64    92   100   100   100   100   100    88    94   100    86     2     0    48    84
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:full]         79.2%    82.2%    96.3%    99%   0.5  10.0s    50    100   100   100   100   100    96    88    94   100     2    40    78    54   100   100   100   100   100    94    84    98    96     0     0    82    52
+gemma-4-E4B-it-Q4_K_M LS/N [reforged:keep-last]    78.7%    81.4%    96.7%    99%   0.5   9.3s    50    100   100   100   100   100    96    92    96    96     6    20    64    62   100   100   100   100   100    88    88   100    98     2     0    82    56
+gemma-4-E4B-it-Q8_0 LS/N [reforged]                77.8%    77.9%    99.8%   100%   0.2  10.8s    50    100   100   100   100   100    76    90   100   100     0    18    38    98   100   100   100   100   100    76    98   100    98     0     0    36    94
+gemma-4-E4B-it-Q8_0 LS/N [reforged:keep-last]      75.7%    79.3%    95.5%    99%   0.5  13.1s    50    100   100   100   100   100    76    92    98    96     4    16    82    48   100   100   100   100   100    50    86    98    94     2     0    86    40
+gemma-4-E4B-it-Q8_0 LS/N [reforged:full]           75.6%    80.8%    93.6%    98%   0.6  12.5s    50    100   100   100   100   100    92    94    92    88     0    18    82    28   100   100   100   100   100    76    92    94    98     2     0    84    26
+gemma4:e4b-it-q4_K_M OL/N [reforged:full]²         74.8%    75.0%    99.8%    83%   0.8  11.3s    50    100   100   100   100   100    94   100   100    92     0     0    44    66   100   100   100   100   100    90   100   100    78     0     0    40    42
+gemma-4-E4B-it-Q8_0 LS/P [reforged:keep-last]      74.1%    74.2%    99.8%    85%   0.6  13.4s    50    100   100   100   100   100    48   100    96    84     0    22    28    94   100   100   100   100   100    52   100    98    90     0     0    18    96
+gemma-4-E4B-it-Q8_0 LS/P [reforged:full]           73.7%    73.7%   100.0%    86%   0.6  12.7s    50    100   100   100   100   100    48   100    92    90     0    36    28    88   100   100   100   100   100    48    94    98    82     0     0    20    92
+gemma4:e4b-it-q8_0 OL/N [reforged:full]²           73.6%    73.8%    99.8%    85%   0.8  12.8s    50    100   100   100   100   100    78    98   100   100     0     8    34    60   100   100   100   100   100    78    94   100    96     0     0    34    34
+gemma-4-E4B-it-Q8_0 LS/P [reforged]                73.2%    73.3%    99.8%    85%   0.6  13.3s    50    100   100   100   100   100    54    98    94    80     0    28    20    94   100   100   100   100   100    40   100    98    90     0     0    12    96
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:keep-last]    73.2%    73.2%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    40    98    96    92     0    10    24    96   100   100   100   100   100    38   100   100    86     0     0    32    92
+gemma-4-E4B-it-Q4_K_M LS/P [reforged:full]         72.9%    72.9%   100.0%    85%   0.6   8.5s    50    100   100   100   100   100    54    98    98    86     0    10    28    88   100   100   100   100   100    26    98    96    94     0     0    30    90
+gemma-4-E4B-it-Q4_K_M LS/P [reforged]              72.4%    72.4%    99.9%    85%   0.6   8.9s    50    100   100   100   100   100    56    96   100    86     0    14    28    84   100   100   100   100   100    34   100    96    74     0     0    22    92
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## mistral-small-3.2
+## qwen3-14b
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged]    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
-Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged]    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                   Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+qwen3:14b-q4_K_M OL/N [reforged:full]²        78.6%    78.7%    99.9%    77%   1.2  38.5s    50    100   100   100   100   100   100   100   100    74     4    12    68    78   100   100   100   100   100   100   100    94    88     4     0    54    68
+Qwen3-14B-Q4_K_M LS/P [reforged:keep-last]    71.8%    71.8%   100.0%    86%   0.5  23.8s    50    100   100   100   100   100    98   100    72    58     2     4    28    72   100   100   100   100   100    94   100    76    74     0     0    32    56
+Qwen3-14B-Q4_K_M LS/P [reforged:full]         71.8%    71.9%    99.8%    87%   0.5  24.3s    50    100   100   100   100   100    96   100    72    72     2     0    30    74   100   100   100   100   100    92   100    74    68     0     0    38    48
+Qwen3-14B-Q4_K_M LS/P [reforged]              71.4%    71.4%    99.9%    86%   0.5  25.6s    50    100   100   100   100   100    98   100    70    70     0     0    30    80   100   100   100   100   100    92   100    76    56     0     0    22    62
+Qwen3-14B-Q4_K_M LS/N [reforged]              68.5%    68.5%   100.0%   100%   0.4  21.8s    50    100   100   100   100    98   100   100    56    72    14     4    58     4   100   100   100   100   100   100   100    42    72    10     0    52     0
+Qwen3-14B-Q4_K_M LS/N [reforged:full]         68.4%    68.4%    99.9%    83%   0.9  21.9s    50    100   100   100   100    98    90    98    60    32    20    18    50    38   100   100   100   100   100    86   100    74    34     6    18    34    22
+Qwen3-14B-Q4_K_M LS/N [reforged:keep-last]    64.0%    64.0%    99.9%    91%   0.6  20.3s    50    100   100   100   100   100    90    98    48    40    18     6    38     4   100   100   100    98    96    88   100    54    30    16     2    38     0
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
-## qwen3-8b
+## mistral-small-3.2
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Qwen3-8B-Q8_0 LS/P [reforged]      73.1%    73.2%    99.8%    89%   0.4  28.4s    50    100   100   100   100   100   100   100    58    96     0     8    28    94   100   100   100   100    96   100    98    64    88     0     0    12    58
-Qwen3-8B-Q4_K_M LS/P [reforged]    70.4%    70.7%    99.6%    86%   0.5  17.8s    50    100   100   100   100   100    94   100    56    64     0    14    12    92   100   100   100   100   100    94   100    62    58     0     0     6    78
-Qwen3-8B-Q8_0 LS/N [reforged]      70.3%    70.5%    99.7%    88%   0.6  24.1s    50    100   100   100   100   100   100   100    60    82     4    22    20    32   100   100   100   100    98    94   100    58    66     2    12    28    50
-Qwen3-8B-Q4_K_M LS/N [reforged]    68.2%    68.4%    99.6%    86%   0.7  16.1s    50     98   100   100   100   100    92   100    48    78     0    44     8    38   100   100   100   100   100    90   100    40    76     0     8    14    38
-qwen3:8b-q8_0 OL/N [reforged]      67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
-qwen3:8b-q4_K_M OL/N [reforged]    64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                                         Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/P [reforged:full]²    78.2%    84.3%    92.7%    78%   1.1   3.6s    50    100   100   100   100   100   100   100   100    28     0     0   100    94   100   100   100   100   100   100   100   100    34     0     0   100    76
+Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M LS/N [reforged:full]²    71.0%    71.2%    99.8%    96%   0.5   6.5s    50    100   100   100   100    98    58   100   100    68     4    12    20    90   100   100   100   100   100    50   100   100     4    22     0    20   100
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## phi-4
@@ -148,41 +158,68 @@ qwen3:8b-q4_K_M OL/N [reforged]    64.9%    65.1%    99.8%    85%   0.6  21.0s
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Model/Backend                     Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-phi-4-Q4_K_M LS/P [reforged]    72.9%    73.3%    99.5%    85%   0.9   4.1s    50    100   100   100   100   100    34    56    96    90    52    24    38    70   100   100   100   100   100    24    62    92    98    52     0    60    48
+phi-4-Q4_K_M LS/P [reforged]    75.3%    75.4%    99.8%    83%   0.9   4.2s    50    100   100   100   100   100    26    62    94    96    62    34    66    70   100   100   100   100   100    28    84    98    94    42     0    60    42
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
+## qwen3-8b
+
+```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                  Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Qwen3-8B-Q8_0 LS/P [reforged:keep-last]      72.8%    73.0%    99.7%    89%   0.4  28.0s    50    100   100   100   100   100    98    98    80    90     0     6     8    94   100   100   100   100    96    96   100    60    98     0     2    12    54
+Qwen3-8B-Q8_0 LS/P [reforged:full]           72.8%    72.9%    99.8%    88%   0.4  28.9s    50    100   100   100   100   100    98   100    70    90     0     4    20    96   100   100   100   100    92   100    96    66    92     0     0    12    56
+Qwen3-8B-Q4_K_M LS/P [reforged:keep-last]    72.2%    72.3%    99.9%    88%   0.5  17.9s    50    100   100   100   100   100   100   100    62    66     0    30     8    90   100   100   100   100   100    98   100    74    68     0     0     8    74
+Qwen3-8B-Q8_0 LS/P [reforged]                72.0%    72.3%    99.6%    88%   0.4  28.6s    50    100   100   100   100   100    96   100    56    90     0     2    10    96   100   100   100   100    96    98   100    58    88     0     0    20    62
+Qwen3-8B-Q4_K_M LS/P [reforged]              71.1%    71.2%    99.8%    87%   0.5  18.0s    50    100   100   100   100   100    96   100    70    66     0    18     8    96   100   100   100   100    98    96   100    60    60     0     0    10    70
+Qwen3-8B-Q4_K_M LS/P [reforged:full]         70.5%    70.8%    99.6%    88%   0.4  17.4s    50    100   100   100   100   100    88   100    58    66     0    24    10    88   100   100   100   100   100    94    98    62    66     0     0     4    76
+Qwen3-8B-Q8_0 LS/N [reforged:full]           69.3%    69.6%    99.6%    88%   0.6  24.7s    50     98   100   100   100   100    94   100    48    76     2    28    24    46   100   100   100   100   100    98   100    46    76     2     8    16    40
+Qwen3-8B-Q8_0 LS/N [reforged]                68.2%    68.5%    99.5%    95%   0.3  24.8s    50    100   100   100   100   100   100   100    56    78     6    24    28     6    98   100   100   100   100   100   100    52    84     6     2    30     2
+qwen3:8b-q8_0 OL/N [reforged:full]²          67.5%    67.6%    99.9%    85%   0.6  31.0s    50    100   100   100   100   100   100   100    26    88     0     2     6    66   100   100   100   100   100   100   100    40    82     0     0     2    44
+Qwen3-8B-Q4_K_M LS/N [reforged]              67.3%    67.5%    99.7%    96%   0.3  15.6s    50    100   100   100   100   100   100   100    40    98     6    14    22     2   100   100   100   100   100   100   100    48    86     2     0    26     6
+Qwen3-8B-Q8_0 LS/N [reforged:keep-last]      67.0%    67.2%    99.8%    92%   0.4  23.2s    50    100   100   100   100   100    94    98    48    84     2    22    10    12   100   100   100   100   100    96   100    52    80     0    18    20     6
+Qwen3-8B-Q4_K_M LS/N [reforged:full]         65.8%    66.0%    99.7%    84%   0.7  17.2s    50    100   100   100   100   100    94   100    34    66     0    18    10    38   100   100   100    96   100    86   100    34    74     0    10    12    40
+qwen3:8b-q4_K_M OL/N [reforged:full]²        64.9%    65.1%    99.8%    85%   0.6  21.0s    50    100   100   100   100   100    96    98    30    62     2     6     2    74    98   100   100   100   100    98   100    26    70     4     0     4    18
+Qwen3-8B-Q4_K_M LS/N [reforged:keep-last]    64.5%    64.6%    99.9%    91%   0.4  15.0s    50    100   100   100   100   100   100   100    30    82     0    28    12    10   100   100   100    98   100    96   100    22    86     2     2     6     4
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+```
+
 ## nemotron-3-nano
 
 ```
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                                       Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged]    71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
-Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged]    70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                             Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/N [reforged:full]²    71.3%    81.0%    88.0%    72%   1.5  21.4s    50    100   100   100   100   100    66    98    52    92    28     4    34    34   100   100   100    98   100    86    92    68    98    24     8    34    38
+Nemotron-3-Nano-30B-A3B-Q4_K_M LS/P [reforged:full]²    70.2%    70.7%    99.4%    89%   0.4  10.8s    50    100   100   100   100    98    52   100    84    90     6     4     0   100   100   100   100   100   100    42   100    80    92     6     2     4    66
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 ## granite-4.1-8b
 
 ```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Model/Backend                              Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-granite4.1:8b-q8_0 OL/N [reforged]       69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
-granite-4.1-8b-Q4_K_M LS/N [reforged]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-granite-4.1-8b-Q8_0 LS/N [reforged]      65.4%    65.4%   100.0%    88%   1.4   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
-granite-4.1-8b-Q4_K_M LS/P [reforged]    61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
-granite-4.1-8b-Q8_0 LS/P [reforged]      61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
-granite4.1:8b-q4_K_M OL/N [reforged]     57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+Model/Backend                                        Scr      Acc      Cmp    Eff   Wst    Spd     N    rel   arg   tsl   b2s   s3s   crt   srn   err   dgr   dge   art   grs   iar rel_s arg_s tsl_s b2s_s s3s_s crt_s srn_s err_s dgr_s dge_s art_s grs_s iar_s
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+granite4.1:8b-q8_0 OL/N [reforged:full]²           69.2%    69.2%   100.0%    83%   1.1   2.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100   100   100     0     0     0     0
+granite-4.1-8b-Q8_0 LS/N [reforged]                65.4%    65.4%   100.0%    88%   1.3   2.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged]              65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:keep-last]    65.4%    68.0%    96.2%    90%   0.8   1.8s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q4_K_M LS/N [reforged:full]         65.4%    68.0%    96.2%    90%   0.8   1.9s    50    100   100   100   100   100   100   100   100   100     0     0     0     0   100   100   100   100   100   100   100     0   100     0     0     0     0
+granite-4.1-8b-Q8_0 LS/P [reforged]                61.5%    66.7%    92.3%    73%   1.0   5.2s    50      0   100   100   100   100   100   100   100     0     0     0   100     0     0   100   100   100   100   100   100   100     0     0     0   100     0
+granite-4.1-8b-Q4_K_M LS/P [reforged]              61.5%    61.5%   100.0%    90%   0.3   2.5s    50    100   100   100   100   100     0   100   100     0     0   100     0     0   100   100   100   100   100     0   100   100     0     0   100     0     0
+granite4.1:8b-q4_K_M OL/N [reforged:full]²         57.8%    57.8%   100.0%    81%   1.3   1.9s    50    100   100   100   100   100   100   100   100     2     0     0     0     0   100   100   100   100   100   100   100     0     2     0     0     0     0
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 ```
 
 Scr=score(correct/total), Acc=accuracy(correct/total, excl validate errors), Cmp=completeness(completed/total), Eff=efficiency(ideal/actual calls), Wst=avg wasted calls, Spd=avg time(excl compaction)
 rel=relevance_detection, arg=argument_fidelity, tsl=tool_selection, b2s=basic_2step, s3s=sequential_3step, crt=conditional_routing, srn=sequential_reasoning, err=error_recovery, dgr=data_gap_recovery, dge=data_gap_recovery_extended, art=argument_transformation, grs=grounded_synthesis, iar=inconsistent_api_recovery, rel_s=relevance_detection_stateful, arg_s=argument_fidelity_stateful, tsl_s=tool_selection_stateful, b2s_s=basic_2step_stateful, s3s_s=sequential_3step_stateful, crt_s=conditional_routing_stateful, srn_s=sequential_reasoning_stateful, err_s=error_recovery_stateful, dgr_s=data_gap_recovery_stateful, dge_s=data_gap_recovery_extended_stateful, art_s=argument_transformation_stateful, grs_s=grounded_synthesis_stateful, iar_s=inconsistent_api_recovery_stateful
 Ablation: full=all guardrails, no_rescue=no rescue loop, no_nudge=no rescue/retry nudge, no_steps=no step enforcement, no_recovery=no error recovery, no_compact=no compaction, bare=all guardrails off
+Replay: ':keep-last'/':full' tags = reasoning_replay policy (how much captured reasoning is re-sent to the backend each turn); untagged = none (default). Rows predating the knob ran unbounded replay and count as full.
 
 Eval generations (older runs carried forward, superscript-tagged):
   ¹ gen 1 — v0.6.0 suite — incl. Anthropic ablation (commit 2b05dc4, 2026-05-08)
+  ² gen 2 — v0.7.0 lineup refresh (8–14B) + 32GB tier debut (v0.7.4) (commit 655e1f6, 2026-05-22)
 
-*Generated 2026-06-03 00:09*
+*Generated 2026-06-11 20:28*
diff --git a/tests/eval/dashboard/src/Sidebar.tsx b/tests/eval/dashboard/src/Sidebar.tsx
index 573a6bd..35ef2bc 100644
--- a/tests/eval/dashboard/src/Sidebar.tsx
+++ b/tests/eval/dashboard/src/Sidebar.tsx
@@ -1,5 +1,6 @@
 import type { ConfigRow, FilterDimension, Filters, ScenarioScope, ScreenId, SuiteScope, ViewId } from "./types";
 import { FILTER_DIMENSIONS, SCENARIO_SCOPES, SUITE_SCOPES } from "./types";
+import { replayRank } from "./utils";
 import { ScreenSelector } from "./ScreenSelector";
 import { ViewSelector } from "./ViewSelector";
 
@@ -8,6 +9,7 @@ const DIMENSION_LABELS: Record<FilterDimension, string> = {
   mode: "Mode",
   family: "Family",
   quant: "Quant",
+  replay: "Reasoning Replay",
 };
 
 interface SidebarProps {
@@ -108,7 +110,11 @@ export function Sidebar({
       )}
 
       {FILTER_DIMENSIONS.map((dim) => {
-        const vals = [...new Set(rows.map((r) => r[dim]))].sort();
+        const vals = [...new Set(rows.map((r) => r[dim]))].sort(
+          dim === "replay"
+            ? (a, b) => replayRank(a) - replayRank(b)
+            : undefined,
+        );
         if (vals.length < 2) return null;
 
         return (
diff --git a/tests/eval/dashboard/src/types.ts b/tests/eval/dashboard/src/types.ts
index d52ef86..19e2715 100644
--- a/tests/eval/dashboard/src/types.ts
+++ b/tests/eval/dashboard/src/types.ts
@@ -5,6 +5,8 @@ export interface ConfigRow {
   backend: string;
   mode: string;
   ablation: string;
+  /** reasoning_replay policy ("none" | "keep-last" | "full"); pre-knob rows count as "full". */
+  replay: string;
   family: string;
   quant: string;
   /** Eval generation this row's data came from (see report.py dedup_latest_gen). */
@@ -72,7 +74,7 @@ export const SUITE_SCOPES: { id: SuiteScope; label: string }[] = [
   { id: "advanced_reasoning", label: "Advanced Reasoning" },
 ];
 
-export const FILTER_DIMENSIONS = ["backend", "mode", "family", "quant"] as const;
+export const FILTER_DIMENSIONS = ["backend", "mode", "family", "quant", "replay"] as const;
 export type FilterDimension = (typeof FILTER_DIMENSIONS)[number];
 
 export type Filters = Record<FilterDimension, Set<string>>;
@@ -102,6 +104,10 @@ export const ABLATION_ORDER: readonly string[] = [
   "no_compact",
 ];
 
+/** Intra-group ordering for reasoning_replay rows: default policy first,
+ * then increasing replay volume. Mirrors _REPLAY_ORDER in report.py. */
+export const REPLAY_ORDER: readonly string[] = ["none", "keep-last", "full"];
+
 /** Pre-baked view definitions — control grouping within the active screen's row set. */
 export type ViewId = "all" | "by-backend" | "by-family";
 
@@ -119,7 +125,7 @@ export const VIEWS: ViewDef[] = [
   {
     id: "by-backend",
     label: "By Backend",
-    groupBy: ["model", "quant", "ablation"],
+    groupBy: ["model", "quant", "ablation", "replay"],
     intraSort: "backend",
   },
   {
diff --git a/tests/eval/dashboard/src/utils.ts b/tests/eval/dashboard/src/utils.ts
index 1481425..eca06f1 100644
--- a/tests/eval/dashboard/src/utils.ts
+++ b/tests/eval/dashboard/src/utils.ts
@@ -1,5 +1,5 @@
 import type { ConfigRow, ScenarioScope, ScreenId, SortState, SuiteScope, ViewDef } from "./types";
-import { ABLATION_ORDER } from "./types";
+import { ABLATION_ORDER, REPLAY_ORDER } from "./types";
 
 /** Filter rows according to the active screen.
  *
@@ -34,6 +34,12 @@ function ablationRank(name: string): number {
   return idx === -1 ? ABLATION_ORDER.length : idx;
 }
 
+/** Rank for sorting reasoning_replay rows in canonical order; unknowns land last. */
+export function replayRank(policy: string): number {
+  const idx = REPLAY_ORDER.indexOf(policy);
+  return idx === -1 ? REPLAY_ORDER.length : idx;
+}
+
 /** Heat-map color class based on percentage value. */
 export function heatClass(v: number | null): string {
   if (v == null) return "";
@@ -253,6 +259,8 @@ export function groupRows(
       if (byAblationRank) {
         const diff = ablationRank(a.ablation) - ablationRank(b.ablation);
         if (diff !== 0) return diff;
+        const rDiff = replayRank(a.replay) - replayRank(b.replay);
+        if (rDiff !== 0) return rDiff;
         return b.score - a.score;
       }
       const scoreDiff = b.score - a.score;
diff --git a/tests/eval/report.py b/tests/eval/report.py
index 9b061a2..5e09604 100644
--- a/tests/eval/report.py
+++ b/tests/eval/report.py
@@ -116,13 +116,23 @@ class ConfigKey:
     mode: str
     ablation: str = "reforged"
     tool_choice: str = "auto"
+    # Pre-knob rows ran unbounded replay — legacy behavior == "full".
+    reasoning_replay: str = "full"
 
     @property
     def _tag(self) -> str:
-        """Ablation + tool_choice tag, e.g. '[full]', '[bare]', '[bare+any]'."""
+        """Ablation + tool_choice + replay tag, e.g. '[bare]', '[bare+any]', '[reforged:full]'.
+
+        The replay policy is tagged only when it differs from the knob's default
+        ("none", not this field's legacy "full"), so default rows keep the clean label.
+        """
         if self.ablation != "reforged" and self.tool_choice != "auto":
-            return f"[{self.ablation}+{self.tool_choice}]"
-        return f"[{self.ablation}]"
+            base = f"{self.ablation}+{self.tool_choice}"
+        else:
+            base = self.ablation
+        if self.reasoning_replay != "none":
+            base = f"{base}:{self.reasoning_replay}"
+        return f"[{base}]"
 
     @property
     def label(self) -> str:
@@ -144,7 +154,10 @@ def short_label(self) -> str:
         return f"{m} {b}/{mode_char} {self._tag}"
 
     def __hash__(self) -> int:
-        return hash((self.model, self.backend, self.mode, self.ablation, self.tool_choice))
+        return hash((
+            self.model, self.backend, self.mode,
+            self.ablation, self.tool_choice, self.reasoning_replay,
+        ))
 
     def __eq__(self, other: object) -> bool:
         if not isinstance(other, ConfigKey):
@@ -155,6 +168,7 @@ def __eq__(self, other: object) -> bool:
             and self.mode == other.mode
             and self.ablation == other.ablation
             and self.tool_choice == other.tool_choice
+            and self.reasoning_replay == other.reasoning_replay
         )
 
 
@@ -172,8 +186,26 @@ def load_jsonl(path: Path) -> list[dict]:
     return rows
 
 
+def _row_replay(row: dict) -> str:
+    """Row-level reasoning_replay, defaulting pre-knob rows to "full".
+
+    Pre-knob rows (no field) ran unbounded replay, which the knob names
+    "full" — so carried-forward older generations surface with an honest
+    ':full' tag rather than masquerading as the current default.
+    """
+    return row.get("reasoning_replay", "full")
+
+
 def _config_tuple(row: dict) -> tuple[str, str, str, str, str]:
-    """The identity a config is deduped on — mirrors ConfigKey's fields."""
+    """The identity a config is deduped on — ConfigKey's fields minus reasoning_replay.
+
+    reasoning_replay is deliberately NOT part of the dedup identity: pre-knob
+    rows have no field, and a newer-gen re-sweep should supersede them
+    regardless of which policies it ran (else every v0.7.0 row would survive
+    as a stale ':full' duplicate next to its re-swept config). Within one
+    generation all policy rows share the gen, so none/keep-last/full survive
+    dedup side by side as separate display rows (see group_rows).
+    """
     return (
         row["model"],
         row["backend"],
@@ -220,7 +252,10 @@ def group_rows(
     for row in rows:
         ablation = row.get("ablation", "reforged")
         tc = row.get("tool_choice", "auto")
-        key = ConfigKey(row["model"], row["backend"], row["mode"], ablation, tc)
+        key = ConfigKey(
+            row["model"], row["backend"], row["mode"], ablation, tc,
+            _row_replay(row),
+        )
         grouped[key][row["scenario"]].append(row)
     return grouped
 
@@ -662,6 +697,10 @@ def extract_quant(model: str) -> str:
         "note": "v0.6.0 suite — incl. Anthropic ablation"},
     2: {"commit": "655e1f6", "date": "2026-05-22",
         "note": "v0.7.0 lineup refresh (8–14B) + 32GB tier debut (v0.7.4)"},
+    # Tag ref, not a commit SHA: gen 3 landed via a branch whose squash-merge
+    # SHA didn't exist when this entry was written; the v0.7.5 tag resolves to it.
+    3: {"commit": "v0.7.5", "date": "2026-06-11",
+        "note": "reasoning-replay grid (8–14B × none/keep-last/full) + Claude thinking-on baseline"},
 }
 
 # Coarse families in the "Retired" tier of docs/MODEL_REGISTRY.md. Retired
@@ -739,6 +778,11 @@ def _legend_lines(scenarios: list[str]) -> list[str]:
         "no_steps=no step enforcement, no_recovery=no error recovery, no_compact=no compaction, "
         "bare=all guardrails off"
     )
+    lines.append(
+        "Replay: ':keep-last'/':full' tags = reasoning_replay policy (how much captured "
+        "reasoning is re-sent to the backend each turn); untagged = none (default). "
+        "Rows predating the knob ran unbounded replay and count as full."
+    )
     return lines
 
 
@@ -942,6 +986,7 @@ def _metrics_to_json_row(m: ConfigMetrics, scenarios: list[str]) -> dict:
         "backend": m.key.backend,
         "mode": m.key.mode,
         "ablation": m.key.ablation,
+        "replay": m.key.reasoning_replay,
         "family": extract_family(m.key.model),
         "quant": extract_quant(m.key.model),
         "gen": m.gen,
@@ -1069,6 +1114,19 @@ def _ablation_rank(name: str) -> int:
         return len(_ABLATION_ORDER)
 
 
+# Ordering for reasoning_replay rows within a group: default policy first,
+# then increasing replay volume. Mirrors REPLAY_ORDER in the dashboard's types.ts.
+_REPLAY_ORDER = ("none", "keep-last", "full")
+
+
+def _replay_rank(policy: str) -> int:
+    """Rank for sorting reasoning_replay rows; unknowns land last."""
+    try:
+        return _REPLAY_ORDER.index(policy)
+    except ValueError:
+        return len(_REPLAY_ORDER)
+
+
 def write_markdown_views(
     all_metrics: list[ConfigMetrics],
     scenarios: list[str],
@@ -1087,6 +1145,7 @@ def write_markdown_views(
         reforged-vs-bare.md — per-(model,backend,mode) reforged+bare pair
         ablation.md         — deep-ablation configs only, 7-row tower per config
         native-vs-prompt.md — llama-server paired native vs prompt (reforged)
+        reasoning-replay.md — replay policy comparison per config (>1 policy)
         budget.md           — compaction scenarios only (reforged)
     """
     import datetime
@@ -1195,7 +1254,10 @@ def _grouped_view(
         "Forge Eval — Reforged vs Bare",
         "Forge lift: reforged vs bare for each (model, backend, mode)",
         [
-            (f"{model} ({backend}/{mode})", sorted(group, key=lambda m: _ablation_rank(m.key.ablation)))
+            (
+                f"{model} ({backend}/{mode})",
+                sorted(group, key=lambda m: (_ablation_rank(m.key.ablation), _replay_rank(m.key.reasoning_replay))),
+            )
             for (model, backend, mode), group in sorted_rb
         ],
     )
@@ -1216,7 +1278,10 @@ def _grouped_view(
         "Forge Eval — Full Ablation",
         "Per-guardrail ablation: each config shows all ablation variants",
         [
-            (f"{model} ({backend}/{mode})", sorted(group, key=lambda m: _ablation_rank(m.key.ablation)))
+            (
+                f"{model} ({backend}/{mode})",
+                sorted(group, key=lambda m: (_ablation_rank(m.key.ablation), _replay_rank(m.key.reasoning_replay))),
+            )
             for (model, backend, mode), group in sorted_abl
         ],
     )
@@ -1233,11 +1298,33 @@ def _grouped_view(
         "Forge Eval — Native vs Prompt (llama-server)",
         "llama-server native FC vs prompt-injected, reforged only",
         [
-            (model, sorted(group, key=lambda m: m.key.mode))
+            (model, sorted(group, key=lambda m: (m.key.mode, _replay_rank(m.key.reasoning_replay))))
             for model, group in sorted(ls_paired.items())
         ],
     )
 
+    # ── Orthogonal: reasoning-replay.md ───────────────────────────
+    # Policy comparison: same config (model, backend, mode, ablation), one row
+    # per reasoning_replay policy. Only configs that ran >1 policy appear.
+    rr_groups: dict[tuple[str, str, str, str], list[ConfigMetrics]] = defaultdict(list)
+    for m in complete:
+        if m.key.ablation in ("reforged", "bare"):
+            rr_groups[(m.key.model, m.key.backend, m.key.mode, m.key.ablation)].append(m)
+    rr_multi = {k: v for k, v in rr_groups.items() if len({m.key.reasoning_replay for m in v}) > 1}
+    sorted_rr = sorted(rr_multi.items(), key=lambda kv: max(m.score for m in kv[1]), reverse=True)
+    _grouped_view(
+        "reasoning-replay.md",
+        "Forge Eval — Reasoning Replay Policies",
+        "reasoning_replay policy comparison (none / keep-last / full) per config",
+        [
+            (
+                f"{model} ({backend}/{mode}) [{ablation}]",
+                sorted(group, key=lambda m: _replay_rank(m.key.reasoning_replay)),
+            )
+            for (model, backend, mode, ablation), group in sorted_rr
+        ],
+    )
+
     # ── Orthogonal: budget.md ─────────────────────────────────────
     compaction_scenarios = [sc for sc in scenarios if sc in {
         "compaction_stress", "phase2_compaction",
@@ -1261,7 +1348,7 @@ def _grouped_view(
             ("## Reforged — which model should I run?", lambda rp: rp.startswith("reforged/")),
             ("## Reforged vs Bare — how much does forge lift a model?", lambda rp: rp == "reforged-vs-bare.md"),
             ("## Full Ablation — which guardrails do the work?", lambda rp: rp == "ablation.md"),
-            ("## Other cross-cuts", lambda rp: rp in ("native-vs-prompt.md", "budget.md")),
+            ("## Other cross-cuts", lambda rp: rp in ("native-vs-prompt.md", "reasoning-replay.md", "budget.md")),
         ]
         index_lines = [
             "# Forge Eval Reports\n",
@@ -1308,6 +1395,11 @@ def main() -> None:
         help="Filter to specific ablation preset(s) (e.g. --ablation reforged bare). "
         "Default: show all.",
     )
+    parser.add_argument(
+        "--reasoning-replay", nargs="*",
+        help="Filter to specific reasoning_replay policies (e.g. --reasoning-replay none). "
+        "Rows predating the knob count as 'full'. Default: show all.",
+    )
     parser.add_argument(
         "--exclude-scenario", nargs="*", metavar="NAME",
         help="Exclude scenario(s) from aggregates and columns "
@@ -1357,6 +1449,14 @@ def main() -> None:
             print(f"No data for ablation preset(s): {', '.join(args.ablation)}")
             sys.exit(0)
 
+    # Filter by reasoning_replay policy if requested
+    if args.reasoning_replay:
+        rr_set = set(args.reasoning_replay)
+        rows = [r for r in rows if _row_replay(r) in rr_set]
+        if not rows:
+            print(f"No data for reasoning_replay policies: {', '.join(args.reasoning_replay)}")
+            sys.exit(0)
+
     # Filter rows by scenario tag before detection
     if args.tags:
         _TAG_FILTERS = {

From 330d57feb371224a80e92fa797c8e5f01f55bf47 Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 11 Jun 2026 20:40:21 -0500
Subject: [PATCH 13/14] docs: reasoning_replay knob + ADR-017 + model-registry
 updates

Document the knob and the new none default across README, User Guide,
and Backend Setup, with links to the eval evidence. ADR-017 records the
policy design, the grid results behind the default, and the alternatives
considered. Model Registry: Claude footnote updated for the v0.7.5
thinking-on re-baseline (Sonnet 4.6 / Opus 4.8; Opus 4.6 and the
deep-ablation rows stay carried forward), and Qwen3 8B Q8_0 is flagged
for future retirement on compute-cost vs signal-value grounds (~23% of
the full sweep for a small Q4/Q8 delta).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 README.md                                     |  2 +-
 docs/BACKEND_SETUP.md                         |  2 +-
 docs/MODEL_REGISTRY.md                        |  9 ++--
 docs/USER_GUIDE.md                            |  4 +-
 docs/decisions/017-reasoning-replay-policy.md | 51 +++++++++++++++++++
 5 files changed, 60 insertions(+), 8 deletions(-)
 create mode 100644 docs/decisions/017-reasoning-replay-policy.md
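
A minimal sketch of the three policies that ADR-017 records, for orientation only. It assumes a hypothetical history of plain dicts with a `type` field; `apply_replay_policy` and the message shape are illustrative names, not forge's API, and the sketch ignores the consecutive-block folding that `fold_and_serialize` performs.

```python
from typing import Literal

ReplayPolicy = Literal["full", "keep-last", "none"]


def apply_replay_policy(history: list[dict], policy: ReplayPolicy) -> list[dict]:
    """Return the backend-facing view of `history`; the input list is not modified."""
    if policy not in ("full", "keep-last", "none"):
        raise ValueError(f"unknown reasoning_replay policy: {policy!r}")
    if policy == "full":
        # Legacy behavior: every captured reasoning block is replayed.
        return list(history)
    reasoning_positions = [
        i for i, msg in enumerate(history) if msg["type"] == "reasoning"
    ]
    # keep-last retains only the most recent reasoning block; none retains nothing.
    kept = set(reasoning_positions[-1:]) if policy == "keep-last" else set()
    return [
        msg for i, msg in enumerate(history)
        if msg["type"] != "reasoning" or i in kept
    ]


# Hypothetical transcript with two captured reasoning blocks.
history = [
    {"type": "user", "text": "list the files"},
    {"type": "reasoning", "text": "I should call ls"},
    {"type": "tool_call", "text": "ls"},
    {"type": "reasoning", "text": "now summarize"},
    {"type": "assistant", "text": "two files found"},
]
```

Because the function only filters a copy, the captured history stays intact for observability, which matches the "backend-facing serialization only" wording in the ADR.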

diff --git a/README.md b/README.md
index e1b827a..0696369 100644
--- a/README.md
+++ b/README.md
@@ -128,7 +128,7 @@ For multi-step workflows, multi-turn conversations, and backend auto-management,
 
 Drop-in proxy that sits between any client and a local model server, speaking both the OpenAI chat-completions API and the Anthropic Messages API (`/v1/messages`). Point your client at the proxy (e.g. `http://localhost:8081/v1`) and forge applies its guardrails transparently — the client thinks it's talking to a smarter model.
 
-This is the path for **using forge with an existing harness** (opencode, Continue, aider, Cline, anything that speaks the OpenAI chat-completions schema — or Claude Code, which speaks the Anthropic Messages API). No Python rewrite. Reasoning replay defaults to `keep-last`, so Forge captures reasoning for observability and replays only the latest available reasoning block to the backend on later turns; use `--reasoning-replay full` for the historical replay-all behavior or `--reasoning-replay none` to keep captured reasoning out of backend-facing history.
+This is the path for **using forge with an existing harness** (opencode, Continue, aider, Cline, anything that speaks the OpenAI chat-completions schema — or Claude Code, which speaks the Anthropic Messages API). No Python rewrite. Reasoning replay defaults to `none`: Forge still captures reasoning for observability, but keeps it out of backend-facing history on later turns — the most token-efficient policy, and statistically indistinguishable from replay-all on the eval suite (see [reasoning-replay results](docs/results/raw/reasoning-replay.md)). Use `--reasoning-replay keep-last` to replay only the latest reasoning block, or `--reasoning-replay full` for the historical replay-all behavior.
 
 ```bash
 # External mode — you manage the backend, forge proxies it
diff --git a/docs/BACKEND_SETUP.md b/docs/BACKEND_SETUP.md
index 26702d3..2c75fe3 100644
--- a/docs/BACKEND_SETUP.md
+++ b/docs/BACKEND_SETUP.md
@@ -75,7 +75,7 @@ llamafile --server --nobrowser -m path/to/model.gguf --port 8080 -ngl 999
 
 `LlamafileClient` is **native-first**: `mode="native"` (the default) forwards tools via the backend's `tools` parameter and requires native function calling (llama.cpp with `--jinja`). For a backend without native FC, declare `mode="prompt"` to inject tool descriptions into the prompt and parse the JSON call back out. The capability is declared at construction and frozen — there is no runtime auto-detection. Native-first is the default because local-model FC support has matured into the more reliable path; prompt-injection stays fully supported as an explicit opt-in, but note that on more complex, multi-step interactions models tend to struggle to drive the prompt-injected protocol reliably, so reach for it only when the backend leaves no alternative.
 
-> **Proxy note:** the OpenAI-compatible proxy is **native-first**. By default (`--backend-capability native`) it forwards the client's tools verbatim to an FC-capable backend (llama.cpp with `--jinja`, vLLM, Ollama, Anthropic) — the recommended setup. For a non-FC llama.cpp/llamafile backend, opt into prompt-injection with `--backend-capability prompt` (strips tools into the prompt, parses the JSON call back; reuses the same prompt path as the WorkflowRunner). The choice is frozen at startup — there is no runtime auto-detect in the proxy. Reasoning replay is controlled separately with `--reasoning-replay {full,keep-last,none}`; the default `keep-last` replays only the latest captured reasoning block to the backend when that reasoning is available in the conversation history. See ADR-012.
+> **Proxy note:** the OpenAI-compatible proxy is **native-first**. By default (`--backend-capability native`) it forwards the client's tools verbatim to an FC-capable backend (llama.cpp with `--jinja`, vLLM, Ollama, Anthropic) — the recommended setup. For a non-FC llama.cpp/llamafile backend, opt into prompt-injection with `--backend-capability prompt` (strips tools into the prompt, parses the JSON call back; reuses the same prompt path as the WorkflowRunner). The choice is frozen at startup — there is no runtime auto-detect in the proxy. Reasoning replay is controlled separately with `--reasoning-replay {full,keep-last,none}`; the default `none` keeps captured reasoning out of backend-facing history (`keep-last` replays only the latest captured reasoning block, `full` replays everything). See ADR-012.
 
 Smoke-test:
 
diff --git a/docs/MODEL_REGISTRY.md b/docs/MODEL_REGISTRY.md
index 370a14b..9d84d4c 100644
--- a/docs/MODEL_REGISTRY.md
+++ b/docs/MODEL_REGISTRY.md
@@ -4,7 +4,7 @@ Every model forge knows about, classified by eval-suite status.
 
 ## Status meanings
 
-- **Current** — in the published eval. The dashboard folds multiple eval *generations* into one view (the v0.7.0 8–14B lineup, plus the v0.7.4 32GB tier); runs not yet re-swept against the latest code — e.g. the Anthropic ablation — are carried forward and superscript-tagged. Numbers in [`docs/results/`](results/) and the [dashboard](results/dashboard.html).
+- **Current** — in the published eval. The dashboard folds multiple eval *generations* into one view (the v0.7.5 reasoning-replay grid for the 8–14B lineup and Claude tier, plus the v0.7.4 32GB tier); runs not yet re-swept against the latest code — e.g. the 32GB tier and the Claude deep-ablation rows — are carried forward and superscript-tagged. Numbers in [`docs/results/`](results/) and the [dashboard](results/dashboard.html).
 - **Retired** — appeared in a prior eval suite, cut from the current one. Either too weak (bare scores below the threshold for informative comparison) or superseded by a newer family member. Sampling defaults retained for backward compatibility.
 - **Unpublished** — sampling defaults are present, but no eval numbers have been published. Forge will work with these models; performance is undocumented.
 
@@ -20,7 +20,7 @@ Sampling values are sourced from the model's HuggingFace card unless noted. Valu
 | Ministral-3 14B Instruct 2512 | Q4_K_M | 0.05¹ | — | — | — | — | — | [HF](https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512) |
 | Ministral-3 8B Reasoning 2512 | Q4_K_M, Q8_0 | 0.7 | —² | — | — | — | — | [HF](https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512) |
 | Ministral-3 14B Reasoning 2512 | Q4_K_M | 1.0 | —² | — | — | — | — | [HF](https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512) |
-| Qwen3 8B | Q4_K_M, Q8_0 | 0.6 | 0.95 | 20 | 0.0 | — | — | [HF](https://huggingface.co/Qwen/Qwen3-8B) |
+| Qwen3 8B | Q4_K_M, Q8_0⁸ | 0.6 | 0.95 | 20 | 0.0 | — | — | [HF](https://huggingface.co/Qwen/Qwen3-8B) |
 | Qwen3 14B | Q4_K_M | 0.6 | 0.95 | 20 | 0.0 | — | — | [HF](https://huggingface.co/Qwen/Qwen3-14B) |
 | Granite 4.1 8B | Q4_K_M, Q8_0 | 0.0³ | 1.0 | 0 | — | — | — | (IBM convention, unconfirmed) |
 | Gemma-4 E4B-it | Q4_K_M, Q8_0 | 1.0 | 0.95 | 64 | — | — | — | [HF](https://huggingface.co/google/gemma-4-e4b-it) |
@@ -33,15 +33,16 @@ Sampling values are sourced from the model's HuggingFace card unless noted. Valu
 | Nemotron-3 Nano 30B-A3B | Q4_K_M | 0.6 | 0.95 | — | — | — | —⁷ | [HF](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16) |
 | Claude Haiku 4.5⁵ | — | — | — | — | — | — | — | (SDK-managed) |
 | Claude Sonnet 4.6⁵ | — | — | — | — | — | — | — | (SDK-managed) |
-| Claude Opus 4.6⁵ | — | — | — | — | — | — | — | (SDK-managed) |
+| Claude Opus 4.8⁵ | — | — | — | — | — | — | — | (SDK-managed) |
 
 ¹ Ministral-3 Instruct cards say "temperature below 0.1 for production"; 0.05 picked within that range.
 ² Ministral-3 Reasoning cards show `top_p=0.95` in code examples but do NOT include it in the formal "Recommended Settings" section. Add explicitly if you want to follow the examples.
 ³ Granite 4.1 sampling mirrors the Granite 4.0 IBM convention (greedy decoding); marked unconfirmed pending IBM publication for the 4.1 family specifically.
 ⁴ Phi-4: no formal sampling recommendation from any official source (Microsoft HF card, model docs). Falls through to backend defaults.
-⁵ **Claude numbers are carried forward from the v0.6.0 dataset** — gen 1 on the dashboard, superscript-tagged. The Anthropic ablation has not been re-run since, owing to cost (~$272 for the full 11,700-row matrix). Backend support is unchanged; numbers are stable to within tool-error-channel sensitivity (small).
+⁵ **Claude baseline re-measured in the v0.7.5 dataset** with extended thinking enabled (adaptive) for Sonnet 4.6 and Opus 4.8; Haiku 4.5 does not support adaptive thinking and runs non-thinking. Earlier Claude rows ran thinking-off: Opus 4.6 and the Anthropic deep-ablation rows are carried forward from the v0.6.0 dataset (gen 1 on the dashboard, superscript-tagged) — the ablation has not been re-run owing to cost (~$272 for the full 11,700-row matrix).
 ⁶ Qwen3.6 27B (dense) deliberately diverges from its A3B siblings: its card drops the `presence_penalty=1.5` the MoE variants recommend, so forge sends `0.0` (no penalty).
 ⁷ Nemotron-3 Nano: the card splits sampling into a Reasoning preset (T=1.0, top_p=1.0) and a Tool-calling preset (T=0.6, top_p=0.95); the tool-calling preset is used here, with thinking enabled via `chat_template_kwargs`.
+⁸ **Qwen3 8B Q8_0 will be cut (→ Retired) in a future eval generation** on compute-cost vs signal-value grounds, not quality: it was the single most expensive model in the v0.7.5 grid (~108 GPU-hours, ~23% of the full sweep) while adding little information over its Q4_K_M sibling (the Q4/Q8 delta is a couple of points on a mid-board model, and the quant-comparison axis is preserved by the cheaper Ministral and Gemma Q4/Q8 pairs). Its numbers stay Current while they are part of the published dataset.
 
 ---
 
diff --git a/docs/USER_GUIDE.md b/docs/USER_GUIDE.md
index e8ec90f..de10ec1 100644
--- a/docs/USER_GUIDE.md
+++ b/docs/USER_GUIDE.md
@@ -85,7 +85,7 @@ claude
 
 **Function-calling capability.** `--backend-capability native` (default) uses the backend's chat-template tool-calling and is the smoother default for Claude Code's heavy multi-turn tool use. `--backend-capability prompt` injects the tool surface into the prompt for llama.cpp/llamafile backends without a tool-calling template; whether a model stays coherent across multi-turn tool results in prompt mode varies by model — and tends to degrade on more complex, multi-step interactions — so prefer native whenever the backend supports it. The capability is declared at startup and frozen.
 
-**Reasoning replay.** Reasoning-capable backends may return hidden reasoning alongside tool calls. Forge captures that reasoning for observability, then controls how much is replayed to the backend on later turns with `--reasoning-replay {full,keep-last,none}`. The default is `keep-last`: only the latest captured reasoning block is replayed. `full` preserves the historical behavior and replays every captured reasoning block. `none` keeps reasoning out of backend-facing history. In OpenAI-compatible proxy responses, `keep-last` exposes current reasoning as `reasoning_content` instead of normal assistant `content` so clients that preserve reasoning fields can replay only the latest block without turning it into plain text. Anthropic proxy responses only emit reasoning text under `full`; Forge does not synthesize signed Anthropic thinking blocks, so default Anthropic proxy responses do not expose replayable reasoning.
+**Reasoning replay.** Reasoning-capable backends may return hidden reasoning alongside tool calls. Forge captures that reasoning for observability, then controls how much is replayed to the backend on later turns with `--reasoning-replay {full,keep-last,none}`. The default is `none`: captured reasoning stays out of backend-facing history entirely. This is the most token-efficient policy, and on forge's eval suite it is statistically indistinguishable from replay-all (no aggregate score cost; see [reasoning-replay results](results/raw/reasoning-replay.md)). `keep-last` replays only the latest captured reasoning block. `full` preserves the historical behavior and replays every captured reasoning block. In OpenAI-compatible proxy responses, `keep-last` exposes current reasoning as `reasoning_content` instead of normal assistant `content` so clients that preserve reasoning fields can replay only the latest block without turning it into plain text; under the default `none`, proxy responses omit captured reasoning. Anthropic proxy responses only emit reasoning text under `full`; Forge does not synthesize signed Anthropic thinking blocks, so default Anthropic proxy responses do not expose replayable reasoning. See [ADR-017](decisions/017-reasoning-replay-policy.md) for the policy design and the eval evidence behind the default.
 
 **Downstream protocol.**
 
@@ -285,7 +285,7 @@ await server.stop()
 
 `WorkflowRunner` accepts an optional `on_message` callback that fires each time a `Message` is appended to the conversation during `run()`. This is the primary observability hook — use it for logging, eval metric collection, or building conversation history for multi-turn flows.
 
-`WorkflowRunner(reasoning_replay=...)` uses the same policy as the proxy: `keep-last` by default, `full` for the historical replay-all behavior, and `none` to avoid replaying captured reasoning to the backend. The policy affects backend-facing serialization only; `MessageType.REASONING` entries still appear in `on_message` and internal history unless context compaction removes them.
+`WorkflowRunner(reasoning_replay=...)` uses the same policy as the proxy: `none` by default (captured reasoning is not replayed to the backend), `keep-last` to replay only the latest reasoning block, and `full` for the historical replay-all behavior. The policy affects backend-facing serialization only; `MessageType.REASONING` entries still appear in `on_message` and internal history unless context compaction removes them.
 
 - **Single-turn (default):** `on_message` fires for every message the runner creates — system prompt, user input, assistant responses, tool results, nudges.
 - **Multi-turn (`initial_messages`):** `run()` accepts an optional `initial_messages` parameter that seeds the conversation with prior history. `on_message` fires **only for new messages created during this turn**, not for the replayed history.
diff --git a/docs/decisions/017-reasoning-replay-policy.md b/docs/decisions/017-reasoning-replay-policy.md
new file mode 100644
index 0000000..e2bba03
--- /dev/null
+++ b/docs/decisions/017-reasoning-replay-policy.md
@@ -0,0 +1,51 @@
+# ADR-017: Reasoning replay is a bounded policy, default `none`
+
+**Status:** accepted (unreleased)
+
+## Context
+
+Reasoning-capable backends (Ministral Reasoning, Qwen3 thinking, Gemma 4, …) return hidden reasoning alongside tool calls. forge captures that reasoning for observability (`MessageType.REASONING`), and historically re-serialized **all** of it into backend-facing history on every later turn: unbounded accumulation, with no way to turn it off.
+
+Two problems motivated bounding this:
+
+- **Convergence.** A proxy non-convergence investigation traced runaway context growth to captured reasoning being replayed back to the backend each turn. Frontier labs practice *scoped* reasoning retention, not replay-everything.
+- **Cost.** Replayed reasoning grows the prompt every turn. On long multi-step workflows it competes with real history for the context budget and inflates per-turn token cost.
+
+A serializer reality check sharpened the question: even the legacy behavior was not a faithful 1:1 re-send — `fold_and_serialize` collapses consecutive reasoning blocks (only the one preceding a tool call survives), so only ~29% of generated reasoning reached the wire on real transcripts. "Replay everything" was already an approximation, not a ground truth worth preserving by default.
+
+## Decision
+
+One knob, `reasoning_replay ∈ {"full", "keep-last", "none"}`, shared by `WorkflowRunner` and the proxy (`--reasoning-replay`), **default `"none"`**.
+
+- **`none` (default)** — captured reasoning never enters backend-facing history.
+- **`keep-last`** — only the most recent captured reasoning block is replayed.
+- **`full`** — legacy behavior; every captured reasoning block is replayed. Pre-knob forge ≡ `full`.
+
+The policy affects **backend-facing serialization only**. Reasoning is still captured, still surfaces in `on_message` and internal history, and still lands in eval transcripts — observability is unchanged.
+
+Proxy response shaping follows the policy: under `keep-last` current reasoning is exposed as `reasoning_content` (so clients that preserve reasoning fields can replay just the latest block); under `full` it rides assistant `content`; under `none` it is omitted. Anthropic-protocol responses emit reasoning text only under `full`; forge does not synthesize signed Anthropic thinking blocks.
+
+## Evidence
+
+The default was chosen from a dedicated re-sweep (the v0.7.5 grid): 14 models × {none, keep-last, full} × {bare, reforged} × {native, prompt}, 50 runs × 26 scenarios per cell, ~170k runs total. Scoring treats the **scenario** as the sampling unit (runs cluster hard within scenarios), with each policy paired against the v0.7.0 legacy/`full` baseline.
+
+- **`full` reproduces the pre-knob baseline** on all reasoning models (no significant difference anywhere): the knob is a clean superset of legacy behavior, and the message-processing refactor did not regress the legacy path.
+- **`none` is statistically indistinguishable from legacy overall** (+0.49pp, p=0.17), and in the reforged-only read (−0.35pp, p=0.45). Bounding replay is a free token saving on this suite.
+- **`none` edges out `keep-last` overall** (+0.86pp, p=0.007); the two are indistinguishable reforged-only.
+- **No robust per-config downside survives multiple-comparison correction.** The closest is the Ministral-14B-Reasoning-Q4 family (reforged-only raw drop ~1.5pp, p≈0.04–0.06, with `none` ≈ `keep-last`) — a family/quantization caveat, not a blocker.
+- **Wire-level validation:** `none` → exactly 0 reasoning on the wire across every row; `keep-last` ∈ {0, 1}; per-transcript ordering full ≥ keep-last ≥ none holds by construction.
+
+Full per-config tables: [results/raw/reasoning-replay.md](../results/raw/reasoning-replay.md).
+
+## Consequences
+
+- **Behavioral change for reasoning-capable backends.** Upgraders who want the old behavior pin `--reasoning-replay full` (proxy) or `WorkflowRunner(reasoning_replay="full")`. For non-reasoning/instruct models the knob is inert and nothing changes.
+- **Token savings by default.** Backend-facing history stops accumulating reasoning; `full` remains the cost wildcard (context grows with run length).
+- **Eval surface.** `reasoning_replay` is part of the eval resume key and a first-class report/dashboard dimension; rows predating the knob count as `full` (that is what they ran).
+- **Claude rows are unaffected.** The Anthropic client drops returned thinking blocks rather than capturing them into history, so the knob is request-inert there; carrying thinking across turns natively is deferred pending evidence it moves scores.
+
+## Alternatives considered
+
+- **Default `keep-last`** (the knob's initial default while evidence was pending). A reasonable middle ground, but it measured slightly *below* `none` overall, still pays a replay cost, and busts rolling prompt-cache prefixes (earlier messages re-serialize differently each turn). Rejected once the grid showed that `none` carries no measurable quality cost.
+- **Default `full` (legacy).** Preserves bug-for-bug continuity, but it is the most expensive policy, delivers no measured score benefit, and is the very accumulation pathology that motivated the knob.
+- **Drop replay entirely (no knob).** Simplest, but unfalsifiable — `full`/`keep-last` exist precisely so the policy stays a measured variable and per-model exceptions (e.g. the Ministral-Q4 caveat) remain one flag away.

From 8ba0f1eb03101e04e9b37a5eb8eb2f130a1f6c9e Mon Sep 17 00:00:00 2001
From: Antoine Zambelli <antoine.zambelli@gmail.com>
Date: Thu, 11 Jun 2026 20:40:28 -0500
Subject: [PATCH 14/14] release: v0.7.5 version bump + changelog

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 CHANGELOG.md   | 13 +++++++++++++
 pyproject.toml |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)
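
A worked sketch of the cache-aware cost accounting that the changelog entry describes, using the 1.25× write and 0.1× read multipliers it states. The `Usage` dataclass, `input_cost_usd`, and the per-million-token price are hypothetical stand-ins, not forge's `TokenUsage` or its pricing table; the sketch also assumes `input_tokens` counts only uncached input.

```python
from dataclasses import dataclass

CACHE_WRITE_MULTIPLIER = 1.25  # cache writes billed at 1.25x the base input rate
CACHE_READ_MULTIPLIER = 0.1    # cache reads billed at 0.1x the base input rate


@dataclass
class Usage:
    """Hypothetical per-turn counters, named after the fields in the changelog."""
    input_tokens: int  # assumed to exclude cached tokens
    cache_creation_input_tokens: int = 0
    cache_read_input_tokens: int = 0


def input_cost_usd(usage: Usage, usd_per_million_input_tokens: float) -> float:
    """Price one turn's input side, weighting cache writes and reads separately."""
    billable = (
        usage.input_tokens
        + CACHE_WRITE_MULTIPLIER * usage.cache_creation_input_tokens
        + CACHE_READ_MULTIPLIER * usage.cache_read_input_tokens
    )
    return billable * usd_per_million_input_tokens / 1_000_000


# Turn 1 writes a 2000-token tools + system prefix; turn 2 reads it back.
turn_1 = Usage(input_tokens=500, cache_creation_input_tokens=2000)
turn_2 = Usage(input_tokens=500, cache_read_input_tokens=2000)
PRICE = 3.0  # hypothetical USD per million input tokens
total = input_cost_usd(turn_1, PRICE) + input_cost_usd(turn_2, PRICE)
```

Under these assumptions the write premium on the first turn is repaid by the first read-hit, since 1.25 + 0.1 is less than the 2.0 that re-sending the prefix uncached on both turns would cost.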

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e4e0a3f..f6961eb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,19 @@
 
 All notable changes to forge are documented here.
 
+## [0.7.5] — 2026-06-11
+
+Reasoning replay is now a measured, bounded policy. Reasoning-capable backends return hidden reasoning alongside tool calls, and forge previously re-serialized all of it into backend-facing history on every later turn. The new `reasoning_replay` knob bounds that, and after a full re-sweep of the published eval grid showed that dropping replayed reasoning costs no measurable aggregate quality while saving tokens, the default is `none`. The release also re-baselines the Claude eval tier with extended thinking enabled and adds Anthropic prompt caching with cache-aware cost accounting.
+
+### Added
+- **`reasoning_replay {full, keep-last, none}`** on `WorkflowRunner(reasoning_replay=…)` and the proxy (`--reasoning-replay`). `full` replays every captured reasoning block (the historical behavior), `keep-last` only the most recent, `none` keeps reasoning out of backend-facing history entirely. Serialization-only: reasoning is still captured and still surfaces in `on_message` and internal history. In OpenAI-compatible proxy responses, `keep-last` exposes current reasoning as `reasoning_content` rather than assistant `content`, so clients that preserve reasoning fields can replay just the latest block. See [ADR-017](docs/decisions/017-reasoning-replay-policy.md).
+- **Reasoning-replay eval grid** (`eval_results_v0.7.5.jsonl`, a new eval generation): the full 8–14B lineup re-swept across all three policies × both ablations × native/prompt — ~170k runs. The policy is part of the eval resume key and a first-class report/dashboard dimension: row labels carry `:keep-last` / `:full` tags (untagged = `none`), the dashboard gains a Reasoning Replay filter, the report a `--reasoning-replay` filter, and a dedicated [reasoning-replay view](docs/results/raw/reasoning-replay.md) compares policies per config. A wire-level counter (`reasoning_wire`) validates each policy's on-wire behavior (`none` → exactly 0 replayed reasoning across every run).
+- **Anthropic extended thinking — `AnthropicClient(thinking=…)`** — request-side extended-thinking config (e.g. `{"type": "adaptive"}`). When set, a forced `tool_choice` is suppressed (the API requires `auto` with thinking on) and `max_tokens` is raised to fit the thinking budget. The Claude eval baseline now runs Sonnet and Opus with adaptive thinking — all prior Claude rows had thinking off, the wrong baseline for a reasoning-flavored suite; Haiku does not support adaptive thinking and stays non-thinking.
+- **Anthropic prompt caching — `AnthropicClient(prompt_caching=True)`** — marks a static ephemeral cache breakpoint over the tool definitions + system prompt (byte-identical every turn, so it read-hits from turn 2 onward instead of re-billing the re-sent schema). `TokenUsage` gains generic `cache_creation_input_tokens` / `cache_read_input_tokens` counters, and eval cost accounting prices cache writes (1.25×) and reads (0.1×) at their actual rates.
+
+### Changed
+- **Captured reasoning is no longer replayed to the backend by default.** Pre-0.7.5 behavior replayed every captured reasoning block (equivalent to `reasoning_replay="full"`); the default is now `"none"`. On the published eval suite, `none` is statistically indistinguishable from replay-all in aggregate while saving the replayed tokens every turn; no per-config regression survives multiple-comparison correction (closest: a mild raw drop on Ministral-3 14B Reasoning Q4, where `none` and `keep-last` are indistinguishable from each other). The knob is inert for models that emit no reasoning. Migration: `--reasoning-replay full` (proxy) or `WorkflowRunner(reasoning_replay="full")` restores the historical behavior. Anthropic-protocol proxy responses emit reasoning text only under `full` — forge does not synthesize signed Anthropic thinking blocks.
+
 ## [0.7.4] — 2026-06-03
 
 Malformed tool-call arguments now self-correct on the tool-error channel, and the eval suite gains its first model-size upgrade — a 32GB tier (Qwen3.5 / 3.6 27–35B, Nemotron-3 Nano, Mistral-Small-3.2) surfaced in the dashboard alongside the existing 8–14B lineup.
diff --git a/pyproject.toml b/pyproject.toml
index 05826c9..c42aa85 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "forge-guardrails"
-version = "0.7.4"
+version = "0.7.5"
 description = "A reliability layer for self-hosted LLM tool-calling. Guardrails, context management, and backend adapters for multi-step agentic workflows."
 requires-python = ">=3.12"
 license = "MIT"