automatic STT keyterm detection by longcw · Pull Request #6039 · livekit/agents

longcw · 2026-06-10T12:03:31Z

Adds automatic keyterm detection to AgentSession, biasing the STT toward the correct spelling of distinctive words (names, companies, products, jargon) as they come up in the conversation.

Overview

New keyterm_options on AgentSession: user-defined terms plus a detection config (enabled, llm, turn_interval, max_keyterms, instructions).
KeytermDetector runs a background LLM pass per user turn over the recent transcript and maintains the keyterm set with a confirmation gate: a new term starts as pending and only biases the STT once later transcript evidence confirms it; remove only applies to spellings the user explicitly corrected, and the replacement goes through the same confirmation flow.
The default prompt treats USER lines as untrusted STT output and ASSISTANT lines as authoritative spelling, and explicitly rejects misrecognitions: sound-alike variants of already-tracked terms, garbled phrases the assistant never adopts, and fragments from interrupted lines.
Detection state is owned by the session so keyterms survive agent handoffs; user-defined terms are shown to the detection LLM as applied but are never modified by it.
New STT.update_keyterms() with a keyterms capability flag, implemented for deepgram (v1/v2), assemblyai, google, and livekit inference STT; the fallback and stream adapters forward it.
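
To make the confirmation gate from the overview concrete, here is a minimal sketch of the pending/applied bookkeeping. It is not the PR's KeytermDetector: the class name KeytermSet and its methods (propose, confirm, correct, active) are hypothetical and only illustrate the described flow.

```python
from dataclasses import dataclass, field


@dataclass
class KeytermSet:
    """Toy model of the pending -> applied confirmation gate (hypothetical names)."""

    user_terms: list[str] = field(default_factory=list)  # never modified by detection
    pending: set[str] = field(default_factory=set)
    applied: set[str] = field(default_factory=set)

    def propose(self, term: str) -> None:
        # A newly detected term starts as pending and does not bias the STT yet.
        if term not in self.user_terms and term not in self.applied:
            self.pending.add(term)

    def confirm(self, term: str) -> None:
        # Later transcript evidence promotes a pending term to applied.
        if term in self.pending:
            self.pending.discard(term)
            self.applied.add(term)

    def correct(self, wrong: str, right: str) -> None:
        # Removal only happens for a spelling the user explicitly corrected;
        # the replacement re-enters the same gate as a pending term.
        self.applied.discard(wrong)
        self.pending.discard(wrong)
        self.propose(right)

    def active(self) -> list[str]:
        # Terms that would be pushed to the STT: user-defined first, then confirmed.
        return list(self.user_terms) + sorted(self.applied)
```

In this sketch a corrected spelling stops biasing the STT immediately, while its replacement waits for confirmation like any other new term.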

Show user-defined terms to the detection LLM as applied so it stops re-proposing them every pass; add misrecognition rules to the default prompt (sound-alike variants of tracked terms, user-only garbled phrases, interrupted-line fragments) and route corrections through the normal confirmation gate.

User-defined keyterms were silently dropped unless detection.enabled was set: start() returned before binding the STT, so the initial push and later set_user_keyterms() were no-ops. Bind the STT unconditionally (skipping the push when there are no terms, so sessions without keyterms see no capability warning or reconnect) and gate only the detection setup on enabled.
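
The control flow of that fix can be sketched as follows. FakeSTT and KeytermManager are hypothetical stand-ins, not the session's real classes; the sketch only shows the ordering: bind first, push only when there are terms, gate detection on the flag.

```python
from typing import Optional


class FakeSTT:
    """Stand-in recognizer that records every keyterm push (hypothetical)."""

    def __init__(self) -> None:
        self.pushes: list[list[str]] = []

    def update_keyterms(self, keyterms: list[str]) -> None:
        self.pushes.append(list(keyterms))


class KeytermManager:
    def __init__(self, terms: list[str], detection_enabled: bool) -> None:
        self._terms = list(terms)
        self._detection_enabled = detection_enabled
        self._stt: Optional[FakeSTT] = None
        self.detection_running = False

    def start(self, stt: FakeSTT) -> None:
        # Bind unconditionally so user-defined terms work without detection.
        self._stt = stt
        if self._terms:
            # Skip the push entirely when there is nothing to send.
            stt.update_keyterms(self._terms)
        # Only the detection setup is gated on the flag.
        if self._detection_enabled:
            self.detection_running = True

    def set_user_keyterms(self, terms: list[str]) -> None:
        self._terms = list(terms)
        if self._stt is not None:
            self._stt.update_keyterms(self._terms)
```

With the earlier behavior described above, the early return would have left the STT unbound, so both the initial push and set_user_keyterms() would have done nothing.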

chenghao-mou · 2026-06-10T14:37:48Z

+        keyterm_options={
+            "terms": ["LiveKit"],
+            "detection": {"enabled": True, "turn_interval": 1},
+        },


For conversations where some context or user information is available before the call (e.g., from a patient or customer profile loaded at startup), should we allow extracting keyterms from that context first?

Perhaps keep that on the developer side: they can pass context such as the address or user name via keyterm_options={"terms": [...]} once the profile loads.
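
A developer-side helper for this could look roughly like the sketch below. The function terms_from_profile, the profile fields, and the sample values are all hypothetical; only the keyterm_options={"terms": [...]} shape comes from this PR.

```python
def terms_from_profile(
    profile: dict[str, object],
    fields: tuple[str, ...] = ("name", "company", "city"),
) -> list[str]:
    """Collect distinctive strings from a pre-loaded profile (hypothetical fields)."""
    terms: list[str] = []
    for key in fields:
        value = profile.get(key)
        # Keep non-empty strings only, without duplicates, in field order.
        if isinstance(value, str) and value.strip() and value.strip() not in terms:
            terms.append(value.strip())
    return terms


# Example profile; in practice this would be loaded before the call starts.
profile = {"name": "Ada Okafor", "company": "LiveKit", "age": 41, "city": " "}
keyterm_options = {"terms": terms_from_profile(profile)}
```

The resulting dict can then be passed as keyterm_options when the session is created.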

chenghao-mou · 2026-06-10T14:43:09Z

+    "enabled": False,
+    "llm": None,
+    "turn_interval": 1,
+    "max_keyterms": None,


Some vendors impose a maximum limit; maybe we should check this in the extract-for-model function.
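
One possible shape for such a check is sketched below. The helper cap_keyterms and the choice to keep user-defined terms ahead of detected ones are assumptions for illustration; real provider limits would have to come from each vendor's documentation.

```python
from typing import Optional


def cap_keyterms(
    user_terms: list[str], detected: list[str], limit: Optional[int]
) -> list[str]:
    """Merge user-defined and detected terms, trimming to an optional limit.

    User-defined terms come first, so trimming drops detected terms before
    anything the developer configured explicitly.
    """
    merged: list[str] = []
    seen: set[str] = set()
    for term in [*user_terms, *detected]:
        key = term.casefold()  # treat "LiveKit" and "livekit" as the same term
        if key not in seen:
            seen.add(key)
            merged.append(term)
    # limit=None mirrors the max_keyterms default shown in the diff (no cap).
    return merged if limit is None else merged[:limit]
```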

devin-ai-integration

Devin Review found 1 new potential issue.

devin-ai-integration · 2026-06-11T04:41:03Z

+    def update_keyterms(self, keyterms: list[str]) -> None:
+        # Google biases recognition via (phrase, boost) pairs; apply a moderate
+        # default boost since the common keyterms API carries no per-term weight.
+        self.update_options(keywords=[(term, _DEFAULT_KEYTERM_BOOST) for term in keyterms])


📝 Info: Google STT _update_keyterms merges user keywords with auto-detected keyterms using a default boost

At livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py:556-570, the Google STT's _update_keyterms merges the provider-agnostic keyterms (which have no per-term weight) with the user's manually-tuned keywords list (which have explicit boosts). A default boost of 10.0 (_DEFAULT_KEYTERM_BOOST at line 74) is applied to auto-detected terms. This is a reasonable heuristic but may need tuning — Google accepts boosts roughly in the 0–20 range, and 10.0 is moderate. Users who need different boost values should use the Google-specific keywords parameter directly.
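
For illustration, a merge of this kind could be sketched as below. This is not the plugin's code: merge_keywords is a hypothetical helper, and letting user-tuned boosts win over the default is an assumption based on the description above.

```python
DEFAULT_KEYTERM_BOOST = 10.0  # the default value quoted in the note above


def merge_keywords(
    user_keywords: list[tuple[str, float]],
    keyterms: list[str],
    default_boost: float = DEFAULT_KEYTERM_BOOST,
) -> list[tuple[str, float]]:
    """Combine user-tuned (phrase, boost) pairs with unweighted keyterms."""
    merged = dict(user_keywords)  # user-tuned boosts take precedence
    for term in keyterms:
        # Only terms without an explicit boost receive the default.
        merged.setdefault(term, default_boost)
    return list(merged.items())
```

Because dicts preserve insertion order, the user's keywords stay at the front of the resulting list.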

devin-ai-integration

Devin Review found 1 new potential issue.

devin-ai-integration · 2026-06-11T08:37:26Z

+    def _update_keyterms(self, keyterms: list[str]) -> None:
+        # Google biases recognition via (phrase, boost) pairs; apply a moderate
+        # default boost since the common keyterms API carries no per-term weight.
+        self.update_options(keywords=[(term, _DEFAULT_KEYTERM_BOOST) for term in keyterms])


🚩 Google STT claims keyterms=True even when adaptation would shadow keywords

The Google STT plugin now unconditionally sets keyterms=True in its capabilities (livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py:234). However, _update_keyterms uses update_options(keywords=...) which is shadowed by an existing adaptation config (stt.py:106-109 — build_adaptation returns adaptation first, ignoring keywords). If a user configures both adaptation and enables keyterm detection, detected terms are stored but never reach the recognizer. The existing warning at stt.py:512-515 covers this partially, but the keyterms capability claim might mislead the keyterm detector into running LLM passes whose results are silently discarded.
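
The shadowing and one possible mitigation can be sketched as follows. Both functions are simplified stand-ins: the dict returned for keywords is a placeholder rather than Google's real adaptation schema, and supports_keyterms is a hypothetical capability check, not something this PR implements.

```python
from typing import Optional


def build_adaptation(
    adaptation: Optional[dict], keywords: list[tuple[str, float]]
) -> Optional[dict]:
    # Precedence as described above: an explicit adaptation wins and
    # the keywords are ignored.
    if adaptation is not None:
        return adaptation
    if keywords:
        # Placeholder shape, for illustration only.
        return {"source": "keywords", "phrases": list(keywords)}
    return None


def supports_keyterms(adaptation: Optional[dict]) -> bool:
    # Hypothetical check: only claim keyterm support when keyword
    # updates can actually reach the recognizer.
    return adaptation is None
```

Deriving the capability from the configuration this way would let the detector skip LLM passes whose results could never be applied.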

chenghao-mou · 2026-06-13T20:51:49Z

                endpoint_url=endpoint_url,
            )

+    def _update_keyterms(self, keyterms: list[str]) -> None:


Only the Flux models support mid-stream keyterm updates (https://developers.deepgram.com/docs/keyterm#dynamic-keyterm-updates-flux-only). Should we disable this for Nova models?
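
If the update were gated per model, the check might look like the sketch below. It assumes, without confirmation from the plugin, that Flux model identifiers start with "flux"; the function name and the sample model strings are illustrative.

```python
def supports_mid_stream_keyterms(model: str) -> bool:
    """Return True when a model is assumed to accept live keyterm updates."""
    # Assumption: Flux model identifiers begin with "flux"; anything else
    # (for example a Nova model) keeps the terms it was connected with.
    return model.strip().lower().startswith("flux")


def plan_keyterm_update(model: str, keyterms: list[str]) -> str:
    """Describe the action a plugin might take for a keyterm change."""
    if not keyterms:
        return "nothing to send"
    if supports_mid_stream_keyterms(model):
        return f"send {len(keyterms)} keyterm(s) on the live connection"
    return "skip: model does not support mid-stream updates"
```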

longcw added 6 commits June 9, 2026 21:04

wip

59f1a58

update fnc tool

ebbe1b5

clean

8234757

refactor to KeytermDetector

73eb033

fix test

a6096af

chenghao-mou reviewed Jun 10, 2026

Merge remote-tracking branch 'origin/main' into longc/auto-stt-keyterms

2ca544a

devin-ai-integration Bot reviewed Jun 11, 2026

make _update_keyterms private

ffd105c

devin-ai-integration Bot reviewed Jun 11, 2026

fix(google): preserve user-tuned keywords on keyterms update

e7b7c92

theomonnom mentioned this pull request Jun 12, 2026

voice: output retries for run(output_type=...) #6080

Open

chenghao-mou reviewed Jun 13, 2026

cursor Bot mentioned this pull request Jun 14, 2026

docs: daily engineering digest for 2026-06-14 Seventhen/agents#5

Draft

longcw wants to merge 10 commits into main from longc/auto-stt-keyterms
