Skip to content

feat: dosu read/write commands, workflow hook, and PAT FTUE fixes#73

Open
brandonyoungdev wants to merge 2 commits into
mainfrom
brandonyoungdev/feat-workflow-hook-ftue
Open

feat: dosu read/write commands, workflow hook, and PAT FTUE fixes#73
brandonyoungdev wants to merge 2 commits into
mainfrom
brandonyoungdev/feat-workflow-hook-ftue

Conversation

@brandonyoungdev
Copy link
Copy Markdown

@brandonyoungdev brandonyoungdev commented Jun 3, 2026

Summary

Lands the workflow-hook FTUE for dosu setup --agent --tool claude. Experimental data shows this pattern drives ~10× the baseline knowledge-capture rate over passive MCP-only setups (see dosu-interventions-findings.md).

What's new

  • dosu read [query] and dosu write "<fact>" — CLI commands that call the new /v1/knowledge/search and /v1/topics backend endpoints. Designed as the primary write path for end-of-task knowledge capture.
  • Claude Code workflow-hook installer (installWorkflowHook in mcp/config-helpers.ts) — dosu setup now writes a UserPromptSubmit hook that reminds the model to dosu write non-obvious facts at end-of-task. Idempotent install + remove.
  • silent option on installSkill() so the FTUE doesn't surface raw npx security warnings.

What's fixed

  • PAT storage timeout — was hitting the 10s HTTP timeout on slow KMS calls. Now uses a 60s budget for /github-pat specifically (Client.doRequest({timeoutMs})), retries on 403/404 for read-after-write race, and surfaces backend detail from the new 503 handler instead of a generic "could not store".
  • Setup resumability — if a prior run created the data source but failed at PAT storage, rerunning now detects the existing data source via findExistingGithubDataSource and skips straight to the PAT prompt. No more duplicate workspaces.
  • doc-analyze messaging — distinguishes three outcomes (indexed-and-empty / data source missing / indexing didn't complete). The misleading "No docs found" no longer fires on real sync failures.
  • Recoverability audit — every error path in setup/, commands/{read,write,ask}, and agent/ now ends with the exact recovery command. Top-level error handler appends dosu logs --tail 30.
  • Drive-by: knowledge.ts and suggest.ts were reading ds.id from dataSource.list — actual field is ds.data_source_id.

Test plan

  • npx vitest run — 1106/1106 pass (added storePat / findExistingGithubDataSource / patStorageFailureMessage / Client timeoutMs / doc-analyze outcome tests)
  • npx tsc --noEmit clean
  • npx biome check . clean
  • Manually verified PAT storage end-to-end against local dev backend (2s success, 164-byte ciphertext, no plaintext leak)
  • Reviewer: manually verify full dosu setup interactive flow + resumability of a prior failed run (bun run dev:local setup)

Release impact

feat: commit will cause semantic-release to cut 0.16.0 on merge to main.

Companion PRs

  • dosu backend — adds the endpoints this CLI calls. Must merge + deploy first or dosu read/write will 404 on prod: https://github.com/dosu-ai/dosu/pull/10843
  • dosu-skill (optional, trivial) — adds the two commands to the reference table.

Lands the workflow-hook FTUE for `dosu setup --agent --tool claude`.
Experimental data shows this pattern drives ~10x the baseline knowledge-
capture rate over passive MCP-only setups (see
testing-c/dosu-interventions-findings.md).

What's new
- `dosu read [query]` and `dosu write "<fact>"` — CLI commands that call
  the new /v1/knowledge/search and /v1/topics backend endpoints.
- Claude Code workflow-hook installer (installWorkflowHook in
  mcp/config-helpers.ts) — `dosu setup` now writes a UserPromptSubmit
  hook reminding the model to `dosu write` non-obvious facts at
  end-of-task. Idempotent install + remove.
- `silent` option on installSkill() so the FTUE doesn't surface raw
  npx security warnings.

What's fixed
- PAT storage timeout — was hitting the 10s HTTP timeout on slow KMS
  calls. Now uses a 60s budget for /github-pat specifically
  (Client.doRequest({timeoutMs})), retries on 403/404 for read-after-
  write race, and surfaces backend `detail` from the new 503 handler.
- Setup resumability — if a prior run created the data source but
  failed at PAT storage, rerunning detects it via
  findExistingGithubDataSource and skips straight to the PAT prompt.
- doc-analyze messaging — distinguishes indexed-and-empty / data
  source missing / indexing didn't complete. The misleading
  "No docs found" no longer fires on real sync failures.
- Recoverability audit — every error path in setup/, commands/
  {read,write,ask}, and agent/ now ends with the exact recovery
  command. Top-level error handler appends `dosu logs --tail 30`.
- Drive-by: knowledge.ts and suggest.ts were reading `ds.id` from
  dataSource.list — actual field is `ds.data_source_id`.

Test plan
- vitest 1106/1106 pass; 5 new test files
- tsc, biome clean
- Manually verified PAT storage end-to-end against local dev backend
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several key enhancements to the CLI, including new read and write commands for interacting with the knowledge base, a direct GitHub Personal Access Token (PAT) connection flow, and automatic document analysis during onboarding. It also adds a workflow hook for Claude Code to prompt agents to save facts, and improves error handling and timeout configurations. The review feedback highlights critical issues to address: preventing a potential infinite loop in the document import polling, safeguarding local configuration from being wiped during transient network failures in deployment verification, making the GitHub URL detection regex more robust to support credentials, and validating that input facts are not empty or whitespace-only in the write command.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/setup/doc-analyze-step.ts Outdated
Comment on lines +112 to +146
async function pollImportCompletion(
trpc: TypedClient,
taskID: string,
): Promise<{ imported: number; failed: number }> {
let errors = 0;
while (true) {
try {
const status = (await trpc.docImports.getImportStatus.query(
taskID,
)) as ImportTaskStatusResponse | null;

if (!status) {
errors += 1;
if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 };
} else {
errors = 0;
if (status.state !== "PROGRESS") {
const detail = status.detail;
return {
imported: detail?.completed ?? 0,
failed: detail?.failed ?? 0,
};
}
}
} catch (err: unknown) {
errors += 1;
logger.warn(
"setup",
`getImportStatus error: ${err instanceof Error ? err.message : String(err)}`,
);
if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 };
}
await sleep(IMPORT_STATUS_POLL_INTERVAL_MS);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The pollImportCompletion function uses an infinite while (true) loop to poll the import status. If the backend task gets stuck in the PROGRESS state indefinitely, this loop will run forever, hanging the CLI setup process. A deadline or maximum timeout check should be introduced to ensure the loop terminates.

async function pollImportCompletion(
  trpc: TypedClient,
  taskID: string,
): Promise<{ imported: number; failed: number }> {
  let errors = 0;
  const deadline = Date.now() + 120_000; // 2-minute timeout
  while (Date.now() < deadline) {
    try {
      const status = (await trpc.docImports.getImportStatus.query(
        taskID,
      )) as ImportTaskStatusResponse | null;

      if (!status) {
        errors += 1;
        if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 };
      } else {
        errors = 0;
        if (status.state !== "PROGRESS") {
          const detail = status.detail;
          return {
            imported: detail?.completed ?? 0,
            failed: detail?.failed ?? 0,
          };
        }
      }
    } catch (err: unknown) {
      errors += 1;
      logger.warn(
        "setup",
        "getImportStatus error: " + (err instanceof Error ? err.message : String(err)),
      );
      if (errors >= IMPORT_STATUS_MAX_ERRORS) return { imported: 0, failed: 0 };
    }
    await sleep(IMPORT_STATUS_POLL_INTERVAL_MS);
  }
  logger.warn("setup", "pollImportCompletion timed out waiting for task completion");
  return { imported: 0, failed: 0 };
}

Comment thread src/setup/flow.ts Outdated
Comment on lines +154 to +171
const deployments = await fetchDeployments(apiClient);
const exists = deployments.some((d) => d.deployment_id === cfg.deployment_id);
if (!exists) {
p.log.warn(
`Project "${cfg.deployment_name ?? cfg.deployment_id}" no longer exists — select a new one.`,
);
cfg.deployment_id = undefined;
cfg.deployment_name = undefined;
cfg.api_key = undefined;
saveConfig(cfg);
const ok = await resolveDeployment(apiClient, cfg, opts);
if (!ok) {
await trackCliOnboardingEvent(cfg, onboardingRunID, "cli_onboarding_failed", {
reason: "deployment_resolution_failed",
});
return;
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If fetchDeployments fails due to a transient network error or API downtime, it returns an empty array []. This causes exists to be false, which immediately triggers a destructive configuration wipe (clearing deployment_id, deployment_name, and api_key) and saves it. To prevent network hiccups from breaking the user's local configuration, we should wrap the deployment verification in a try/catch block and only wipe the configuration if the API request succeeds and explicitly confirms the deployment no longer exists.

    let deployments: Deployment[] | null = null;
    try {
      deployments = await apiClient.getDeployments();
    } catch (err) {
      logger.warn("setup", "Failed to verify existing deployment due to network/API error: " + err);
    }
    if (deployments !== null) {
      const exists = deployments.some((d) => d.deployment_id === cfg.deployment_id);
      if (!exists) {
        p.log.warn(
          "Project \"" + (cfg.deployment_name ?? cfg.deployment_id) + "\" no longer exists — select a new one."
        );
        cfg.deployment_id = undefined;
        cfg.deployment_name = undefined;
        cfg.api_key = undefined;
        saveConfig(cfg);
        const ok = await resolveDeployment(apiClient, cfg, opts);
        if (!ok) {
          await trackCliOnboardingEvent(cfg, onboardingRunID, "cli_onboarding_failed", {
            reason: "deployment_resolution_failed",
          });
          return;
        }
      }
    }

Comment on lines +85 to +88
let m = url.match(/^git@github\.com:([^/]+)\/([^/]+?)(?:\.git)?$/i);
if (!m) {
m = url.match(/^https?:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/i);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current regular expression for matching HTTPS GitHub URLs does not support URLs containing optional credentials (e.g., https://username:token@github.com/owner/repo.git), which are common in automated environments or CI/CD setups. Updating the regex to handle optional credentials makes the repository detection more robust.

  let m = url.match(/^git@github\\.com:([^/]+)\\/([^/]+?)(?:\\.git)?$/i);
  if (!m) {
    m = url.match(/^https?:\\/\\/(?:[^@]+@)?github\\.com\\/([^/]+)\\/([^/]+?)(?:\\.git)?\\/?$/i);
  }

Comment thread src/commands/write.ts
Comment on lines +34 to +45
.action(async (fact: string, opts: { json?: boolean }) => {
const cfg = requireConfig();
const apiClient = new Client(cfg);

logger.debug("write", `Saving topic: ${fact.slice(0, 60)}`);

const resp = await apiClient.doRequest("POST", "/v1/topics", {
// biome-ignore lint/style/noNonNullAssertion: checked in requireConfig
deployment_id: cfg.deployment_id!,
name: fact.trim().split(/\s+/).slice(0, 8).join(" "),
context: fact,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the user passes a whitespace-only string as the <fact> argument (e.g., " "), fact.trim() becomes empty, resulting in an empty name and context being sent to the backend. It is better to validate that the trimmed fact is not empty before making the API request to prevent unnecessary network calls and potential backend validation errors.

    .action(async (fact: string, opts: { json?: boolean }) => {
      const trimmedFact = fact.trim();
      if (trimmedFact.length === 0) {
        console.error(pc.red("Error: Fact cannot be empty or whitespace-only."));
        process.exit(1);
      }

      const cfg = requireConfig();
      const apiClient = new Client(cfg);

      logger.debug("write", "Saving topic: " + trimmedFact.slice(0, 60));

      const resp = await apiClient.doRequest("POST", "/v1/topics", {
        // biome-ignore lint/style/noNonNullAssertion: checked in requireConfig
        deployment_id: cfg.deployment_id!,
        name: trimmedFact.split(/\\s+/).slice(0, 8).join(" "),
        context: trimmedFact,
      });

Three fixes from Gemini Code Assist review (skipping the GitHub URL
credentials regex tweak — marginal value for interactive CLI use).

Fix 1 — pollImportCompletion infinite loop (HIGH)
- Add a 5-minute deadline so a stuck backend doc-import task can't
  hang setup forever.
- Return { imported, failed, timed_out: true } on timeout so the caller
  can surface a meaningful warning instead of a silent { imported: 0 }.
- Caller in stepAnalyzeDocs now logs a recovery hint pointing at
  `dosu docs import-status` / `dosu sources list`.

Fix 2 — config wipe on transient network failure (HIGH, data loss)
- The deployment-verification path was calling fetchDeployments(), which
  swallows errors and returns []. Combined with the "deployment not
  found → wipe local config" branch, this meant any transient API
  outage silently corrupted the user's deployment_id, deployment_name,
  and api_key in ~/.config/dosu-cli/config.json.
- Now calls apiClient.getDeployments() directly, catches errors locally,
  logs a warning, and preserves the cached config. The wipe-and-re-pick
  branch only runs when the API explicitly returns a deployment list
  that excludes our id.

Fix 4 — dosu write rejects empty / whitespace-only input
- One-line validation at the top of the action. Exit with a helpful
  example instead of round-tripping garbage to /v1/topics for a 422.

Tests
- New src/commands/write.test.ts (5 tests) — empty / whitespace
  rejection, trimming, backend error surfacing, missing-config exit.
- New src/setup/doc-analyze-step.test.ts (6 tests) — SUCCESS, FAILURE,
  timeout, max-errors (null + throw), recovery.
- New src/setup/flow.test.ts tests (2) — preserves config on network
  failure, still wipes on explicit "not found".

Full sweep: 1119/1119 pass (+13 from 1106). typecheck clean. biome clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant