8 changes: 8 additions & 0 deletions .env
@@ -48,6 +48,14 @@ MODELS=
# to use for internal tasks (title summarization, etc). If not set, the current model will be used
TASK_MODEL=

## Reasoning
# Reasoning-capable models stream their chain-of-thought. Set to "true" to also
# generate short, periodic natural-language summaries of that reasoning, shown as
# status updates while the model thinks. This issues additional LLM calls (using
# TASK_MODEL if set, otherwise the conversation model), so it is disabled by
# default. Leave empty/unset to disable.
REASONING_SUMMARY=

## LLM Router Configuration
# Path to routes policy (JSON array). Required when the router is enabled; must point to a valid JSON file.
# The router uses heuristic-based selection to pick the best model for each request.
10 changes: 10 additions & 0 deletions docs/source/configuration/overview.md
@@ -55,6 +55,16 @@ TASK_MODEL=meta-llama/Llama-3.1-8B-Instruct

If not set, the current conversation model is used.

## Reasoning

Reasoning-capable models stream their chain-of-thought. To also generate short, periodic natural-language summaries of that reasoning (shown as status updates while the model thinks), enable:

```ini
REASONING_SUMMARY=true
```

This issues additional LLM calls (using `TASK_MODEL` if set, otherwise the conversation model), so it is **disabled by default**. Leave it unset or empty to keep reasoning summaries off.

[P2] Correct the model used for reasoning summaries

When `REASONING_SUMMARY=true`, the periodic summary calls do not honor `TASK_MODEL`. `generate.ts` calls `generateSummaryOfReasoning(reasoningBuffer, model.id, locals)`, and `generateFromDefaultEndpoint` tries the supplied `modelId` before falling back to `taskModel`, so the conversation model is used whenever it is in the model list. As written, this documentation can mislead operators who set `TASK_MODEL` to a cheaper or less rate-limited model and then enable the flag expecting the extra calls to go there.
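
To make the precedence concrete, here is a minimal sketch of the selection order described in this comment. It is not the project's actual code: `resolveSummaryModel`, `ModelEntry`, and the model names are hypothetical, and the real `generateFromDefaultEndpoint` may differ in its details.

```typescript
// Hypothetical stand-in for an entry in the configured model list.
interface ModelEntry {
  id: string;
}

// Illustrative only: mirrors the order described above, where the supplied
// modelId is tried first and taskModel is only a fallback.
function resolveSummaryModel(
  models: ModelEntry[],
  modelId: string | undefined,
  taskModel: string | undefined
): ModelEntry | undefined {
  // 1. Prefer the model id passed by the caller (here, the conversation model).
  const byId = modelId ? models.find((m) => m.id === modelId) : undefined;
  if (byId) {
    return byId;
  }
  // 2. Only when that lookup fails is TASK_MODEL considered.
  return taskModel ? models.find((m) => m.id === taskModel) : undefined;
}

// Assumed model names, for illustration.
const models: ModelEntry[] = [{ id: "big-reasoner" }, { id: "small-task" }];

// The conversation model wins even though a task model is configured.
const chosen = resolveSummaryModel(models, "big-reasoner", "small-task");
console.log(chosen?.id); // "big-reasoner"
```

Under this ordering, the task model is reached only when no `modelId` is supplied or the supplied id is not in the list, which is why passing `model.id` from `generate.ts` keeps the summary calls on the conversation model.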


## Voice Transcription

Enable voice input with Whisper: