feat: dosu read/write commands, workflow hook, and PAT FTUE fixes#73
feat: dosu read/write commands, workflow hook, and PAT FTUE fixes#73brandonyoungdev wants to merge 2 commits into
Conversation
Lands the workflow-hook FTUE for `dosu setup --agent --tool claude`.
Experimental data shows this pattern drives ~10x the baseline knowledge-
capture rate over passive MCP-only setups (see
testing-c/dosu-interventions-findings.md).
What's new
- `dosu read [query]` and `dosu write "<fact>"` — CLI commands that call
the new /v1/knowledge/search and /v1/topics backend endpoints.
- Claude Code workflow-hook installer (installWorkflowHook in
mcp/config-helpers.ts) — `dosu setup` now writes a UserPromptSubmit
hook reminding the model to `dosu write` non-obvious facts at
end-of-task. Idempotent install + remove.
- `silent` option on installSkill() so the FTUE doesn't surface raw
npx security warnings.
What's fixed
- PAT storage timeout — was hitting the 10s HTTP timeout on slow KMS
calls. Now uses a 60s budget for /github-pat specifically
(Client.doRequest({timeoutMs})), retries on 403/404 for read-after-
write race, and surfaces backend `detail` from the new 503 handler.
- Setup resumability — if a prior run created the data source but
failed at PAT storage, rerunning detects it via
findExistingGithubDataSource and skips straight to the PAT prompt.
- doc-analyze messaging — distinguishes indexed-and-empty / data
source missing / indexing didn't complete. The misleading
"No docs found" no longer fires on real sync failures.
- Recoverability audit — every error path in setup/, commands/
{read,write,ask}, and agent/ now ends with the exact recovery
command. Top-level error handler appends `dosu logs --tail 30`.
- Drive-by: knowledge.ts and suggest.ts were reading `ds.id` from
dataSource.list — actual field is `ds.data_source_id`.
Test plan
- vitest 1106/1106 pass; 5 new test files
- tsc, biome clean
- Manually verified PAT storage end-to-end against local dev backend
There was a problem hiding this comment.
Code Review
This pull request introduces several key enhancements to the CLI, including new read and write commands for interacting with the knowledge base, a direct GitHub Personal Access Token (PAT) connection flow, and automatic document analysis during onboarding. It also adds a workflow hook for Claude Code to prompt agents to save facts, and improves error handling and timeout configurations. The review feedback highlights critical issues to address: preventing a potential infinite loop in the document import polling, safeguarding local configuration from being wiped during transient network failures in deployment verification, making the GitHub URL detection regex more robust to support credentials, and validating that input facts are not empty or whitespace-only in the write command.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| async function pollImportCompletion( | ||
| trpc: TypedClient, | ||
| taskID: string, | ||
| ): Promise<{ imported: number; failed: number }> { | ||
| let errors = 0; | ||
| while (true) { | ||
| try { | ||
| const status = (await trpc.docImports.getImportStatus.query( | ||
| taskID, | ||
| )) as ImportTaskStatusResponse | null; | ||
|
|
||
| if (!status) { | ||
| errors += 1; | ||
| if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 }; | ||
| } else { | ||
| errors = 0; | ||
| if (status.state !== "PROGRESS") { | ||
| const detail = status.detail; | ||
| return { | ||
| imported: detail?.completed ?? 0, | ||
| failed: detail?.failed ?? 0, | ||
| }; | ||
| } | ||
| } | ||
| } catch (err: unknown) { | ||
| errors += 1; | ||
| logger.warn( | ||
| "setup", | ||
| `getImportStatus error: ${err instanceof Error ? err.message : String(err)}`, | ||
| ); | ||
| if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 }; | ||
| } | ||
| await sleep(IMPORT_STATUS_POLL_INTERVAL_MS); | ||
| } | ||
| } |
There was a problem hiding this comment.
The pollImportCompletion function uses an infinite while (true) loop to poll the import status. If the backend task gets stuck in the PROGRESS state indefinitely, this loop will run forever, hanging the CLI setup process. A deadline or maximum timeout check should be introduced to ensure the loop terminates.
async function pollImportCompletion(
trpc: TypedClient,
taskID: string,
): Promise<{ imported: number; failed: number }> {
let errors = 0;
const deadline = Date.now() + 120_000; // 2-minute timeout
while (Date.now() < deadline) {
try {
const status = (await trpc.docImports.getImportStatus.query(
taskID,
)) as ImportTaskStatusResponse | null;
if (!status) {
errors += 1;
if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 };
} else {
errors = 0;
if (status.state !== "PROGRESS") {
const detail = status.detail;
return {
imported: detail?.completed ?? 0,
failed: detail?.failed ?? 0,
};
}
}
} catch (err: unknown) {
errors += 1;
logger.warn(
"setup",
"getImportStatus error: " + (err instanceof Error ? err.message : String(err)),
);
if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 };
}
await sleep(IMPORT_STATUS_POLL_INTERVAL_MS);
}
logger.warn("setup", "pollImportCompletion timed out waiting for task completion");
return { imported: 0, failed: 0 };
}| const deployments = await fetchDeployments(apiClient); | ||
| const exists = deployments.some((d) => d.deployment_id === cfg.deployment_id); | ||
| if (!exists) { | ||
| p.log.warn( | ||
| `Project "${cfg.deployment_name ?? cfg.deployment_id}" no longer exists — select a new one.`, | ||
| ); | ||
| cfg.deployment_id = undefined; | ||
| cfg.deployment_name = undefined; | ||
| cfg.api_key = undefined; | ||
| saveConfig(cfg); | ||
| const ok = await resolveDeployment(apiClient, cfg, opts); | ||
| if (!ok) { | ||
| await trackCliOnboardingEvent(cfg, onboardingRunID, "cli_onboarding_failed", { | ||
| reason: "deployment_resolution_failed", | ||
| }); | ||
| return; | ||
| } | ||
| } |
There was a problem hiding this comment.
If fetchDeployments fails due to a transient network error or API downtime, it returns an empty array []. This causes exists to be false, which immediately triggers a destructive configuration wipe (clearing deployment_id, deployment_name, and api_key) and saves it. To prevent network hiccups from breaking the user's local configuration, we should wrap the deployment verification in a try/catch block and only wipe the configuration if the API request succeeds and explicitly confirms the deployment no longer exists.
let deployments: Deployment[] | null = null;
try {
deployments = await apiClient.getDeployments();
} catch (err) {
logger.warn("setup", "Failed to verify existing deployment due to network/API error: " + err);
}
if (deployments !== null) {
const exists = deployments.some((d) => d.deployment_id === cfg.deployment_id);
if (!exists) {
p.log.warn(
"Project \"" + (cfg.deployment_name ?? cfg.deployment_id) + "\" no longer exists — select a new one."
);
cfg.deployment_id = undefined;
cfg.deployment_name = undefined;
cfg.api_key = undefined;
saveConfig(cfg);
const ok = await resolveDeployment(apiClient, cfg, opts);
if (!ok) {
await trackCliOnboardingEvent(cfg, onboardingRunID, "cli_onboarding_failed", {
reason: "deployment_resolution_failed",
});
return;
}
}
}| let m = url.match(/^git@github\.com:([^/]+)\/([^/]+?)(?:\.git)?$/i); | ||
| if (!m) { | ||
| m = url.match(/^https?:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/i); | ||
| } |
There was a problem hiding this comment.
The current regular expression for matching HTTPS GitHub URLs does not support URLs containing optional credentials (e.g., https://username:token@github.com/owner/repo.git), which are common in automated environments or CI/CD setups. Updating the regex to handle optional credentials makes the repository detection more robust.
let m = url.match(/^git@github\\.com:([^/]+)\\/([^/]+?)(?:\\.git)?$/i);
if (!m) {
m = url.match(/^https?:\\/\\/(?:[^@]+@)?github\\.com\\/([^/]+)\\/([^/]+?)(?:\\.git)?\\/?$/i);
}| .action(async (fact: string, opts: { json?: boolean }) => { | ||
| const cfg = requireConfig(); | ||
| const apiClient = new Client(cfg); | ||
|
|
||
| logger.debug("write", `Saving topic: ${fact.slice(0, 60)}`); | ||
|
|
||
| const resp = await apiClient.doRequest("POST", "/v1/topics", { | ||
| // biome-ignore lint/style/noNonNullAssertion: checked in requireConfig | ||
| deployment_id: cfg.deployment_id!, | ||
| name: fact.trim().split(/\s+/).slice(0, 8).join(" "), | ||
| context: fact, | ||
| }); |
There was a problem hiding this comment.
If the user passes a whitespace-only string as the <fact> argument (e.g., " "), fact.trim() becomes empty, resulting in an empty name and context being sent to the backend. It is better to validate that the trimmed fact is not empty before making the API request to prevent unnecessary network calls and potential backend validation errors.
.action(async (fact: string, opts: { json?: boolean }) => {
const trimmedFact = fact.trim();
if (trimmedFact.length === 0) {
console.error(pc.red("Error: Fact cannot be empty or whitespace-only."));
process.exit(1);
}
const cfg = requireConfig();
const apiClient = new Client(cfg);
logger.debug("write", "Saving topic: " + trimmedFact.slice(0, 60));
const resp = await apiClient.doRequest("POST", "/v1/topics", {
// biome-ignore lint/style/noNonNullAssertion: checked in requireConfig
deployment_id: cfg.deployment_id!,
name: trimmedFact.split(/\\s+/).slice(0, 8).join(" "),
context: trimmedFact,
});Three fixes from Gemini Code Assist review (skipping the GitHub URL
credentials regex tweak — marginal value for interactive CLI use).
Fix 1 — pollImportCompletion infinite loop (HIGH)
- Add a 5-minute deadline so a stuck backend doc-import task can't
hang setup forever.
- Return { imported, failed, timed_out: true } on timeout so the caller
can surface a meaningful warning instead of a silent { imported: 0 }.
- Caller in stepAnalyzeDocs now logs a recovery hint pointing at
`dosu docs import-status` / `dosu sources list`.
Fix 2 — config wipe on transient network failure (HIGH, data loss)
- The deployment-verification path was calling fetchDeployments(), which
swallows errors and returns []. Combined with the "deployment not
found → wipe local config" branch, this meant any transient API
outage silently corrupted the user's deployment_id, deployment_name,
and api_key in ~/.config/dosu-cli/config.json.
- Now calls apiClient.getDeployments() directly, catches errors locally,
logs a warning, and preserves the cached config. The wipe-and-re-pick
branch only runs when the API explicitly returns a deployment list
that excludes our id.
Fix 4 — dosu write rejects empty / whitespace-only input
- One-line validation at the top of the action. Exit with a helpful
example instead of round-tripping garbage to /v1/topics for a 422.
Tests
- New src/commands/write.test.ts (5 tests) — empty / whitespace
rejection, trimming, backend error surfacing, missing-config exit.
- New src/setup/doc-analyze-step.test.ts (6 tests) — SUCCESS, FAILURE,
timeout, max-errors (null + throw), recovery.
- New src/setup/flow.test.ts tests (2) — preserves config on network
failure, still wipes on explicit "not found".
Full sweep: 1119/1119 pass (+13 from 1106). typecheck clean. biome clean.
Summary
Lands the workflow-hook FTUE for
dosu setup --agent --tool claude. Experimental data shows this pattern drives ~10× the baseline knowledge-capture rate over passive MCP-only setups (see dosu-interventions-findings.md).What's new
dosu read [query]anddosu write "<fact>"— CLI commands that call the new/v1/knowledge/searchand/v1/topicsbackend endpoints. Designed as the primary write path for end-of-task knowledge capture.installWorkflowHookinmcp/config-helpers.ts) —dosu setupnow writes aUserPromptSubmithook that reminds the model todosu writenon-obvious facts at end-of-task. Idempotent install + remove.silentoption oninstallSkill()so the FTUE doesn't surface raw npx security warnings.What's fixed
/github-patspecifically (Client.doRequest({timeoutMs})), retries on 403/404 for read-after-write race, and surfaces backenddetailfrom the new 503 handler instead of a generic "could not store".findExistingGithubDataSourceand skips straight to the PAT prompt. No more duplicate workspaces.setup/,commands/{read,write,ask}, andagent/now ends with the exact recovery command. Top-level error handler appendsdosu logs --tail 30.knowledge.tsandsuggest.tswere readingds.idfromdataSource.list— actual field isds.data_source_id.Test plan
npx vitest run— 1106/1106 pass (added storePat / findExistingGithubDataSource / patStorageFailureMessage / ClienttimeoutMs/ doc-analyze outcome tests)npx tsc --noEmitcleannpx biome check .cleandosu setupinteractive flow + resumability of a prior failed run (bun run dev:local setup)Release impact
feat:commit will cause semantic-release to cut0.16.0on merge to main.Companion PRs
dosu read/writewill 404 on prod: https://github.com/dosu-ai/dosu/pull/10843