Store ingestion AI responses for classification audit by nicholasjpanella · Pull Request #24 · baywire/baywire.app

nicholasjpanella · 2026-06-13T23:29:43Z

Summary

Adds an ai_responses table linked to ai_usage so ingestion LLM calls retain prompts and structured output for debugging classification decisions (e.g. why a place was labeled speakeasy).

Type of change

New feature

Testing

npm run typecheck
npm run db:push (schema applied to Neon)
npm run lint (pre-existing errors unrelated to this change)

Checklist

Ingestion changes fix the write path (no repair-only scripts)
No secrets committed

Notes for reviewers

New table: baywire.ai_responses with FK to ai_usage (cascade delete).
Wired through extended logAiUsage() for: discover, enrich, editorial (place + event), event extract, listing extract, suggest event.
Stores system_prompt, user_prompt, parsed_output, optional entity_type/entity_id, subject_name (pre-persist), taxonomy_version, prompt_revision.
Historical runs have no stored responses; data appears on the next discover/classify/scrape run.
Example query for a place after re-classify:

SELECT r.stage, r.created_at, r.parsed_output, r.user_prompt
FROM baywire.ai_responses r
WHERE r.entity_type = 'place' AND r.entity_id = '<uuid>'
ORDER BY r.created_at DESC;
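
For readers who work from application code rather than SQL, the stored columns can be pictured as a TypeScript shape. This is only a sketch: the camel-cased field names are inferred from the column list above, `AiResponseRow` and `latestForEntity` are hypothetical names, and the Prisma-generated `AiResponse` type remains the source of truth.

```typescript
// Hypothetical row shape inferred from the column list in this description.
// The generated Prisma type may use different names or nullability.
interface AiResponseRow {
  stage: string;
  createdAt: Date;
  systemPrompt: string | null;
  userPrompt: string | null;
  parsedOutput: unknown;
  entityType: string | null; // enum in the real schema; values not assumed here
  entityId: string | null;
  subjectName: string | null;
  taxonomyVersion: string | null;
  promptRevision: string | null;
}

// In-memory equivalent of the example query: match on entity, newest first.
function latestForEntity(
  rows: readonly AiResponseRow[],
  entityType: string,
  entityId: string,
): AiResponseRow[] {
  return rows
    .filter((r) => r.entityType === entityType && r.entityId === entityId)
    .sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime());
}
```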

Add ai_responses table linked to ai_usage so discovery, enrich, editorial, extract, and suggest LLM calls retain prompts and structured output for classification debugging. Wire all ingestion call sites through the extended logAiUsage helper.

Co-authored-by: Nicholas P. <nicholasjpanella@users.noreply.github.com>

vercel · 2026-06-13T23:29:49Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
baywire-app	Building		Jun 13, 2026 11:30pm
baywire-app (test)	Ready	Preview	Jun 13, 2026 11:30pm

qodo-code-review · 2026-06-17T03:28:00Z

PR Summary by Qodo

Persist ingestion LLM prompts/outputs via ai_responses linked to ai_usage
✨ Enhancement ⚙️ Configuration changes 🕐 40+ Minutes

Description

• Add baywire.ai_responses to persist prompts and parsed LLM output for audit/debugging.
• Extend logAiUsage() to optionally write an AiResponse row alongside each usage record.
• Wire discover/enrich/editorial/extract/suggest call sites to pass prompts, output, and taxonomy metadata.

Diagram

graph TD
  Ingest["Ingestion jobs"] --> OpenAI{{"OpenAI API"}}
  Ingest --> Log["logAiUsage()"] --> Prisma["Prisma Client"] --> Usage[("ai_usage")]
  Prisma --> Resp[("ai_responses")]
  Usage --> Resp

  subgraph Legend
    direction LR
    _mod["Module"] ~~~ _db[("Table")] ~~~ _ext{{"External"}}
  end

High-Level Assessment

The following are alternative approaches to this PR:

1. Store prompts/output in ai_usage.meta JSON

➕ No new table/migration needed
➕ Single-row read to fetch usage + payload
➖ Harder to index/query by entity and time
➖ Mixes operational usage data with potentially large payloads
➖ Less explicit relational modeling (and harder to enforce constraints)

2. Write LLM payloads to object storage (S3/GCS) and store pointers

➕ Avoids storing large text/JSON in Postgres
➕ Cheaper long-term storage and easier retention policies
➖ More moving parts (bucket, permissions, lifecycle)
➖ Harder ad-hoc SQL debugging and joins
➖ Extra failure modes during ingestion

3. Centralized append-only audit log/event stream

➕ Strong audit semantics and replayability
➕ Can support multiple consumers and retention strategies
➖ Significantly higher implementation/operational complexity
➖ Overkill if primary use case is SQL debugging by entity

Recommendation: The PR’s approach (a normalized ai_responses table linked to ai_usage) is the best fit for SQL-first debugging and audit queries. It keeps usage/cost tracking separate from verbose prompts/output while enabling indexed lookups by entity and time, and avoids introducing new infrastructure like object storage or event streaming.

Files changed (20) +2241 / -12

Enhancement (8) +166 / -4

ai-usage.ts: Extend logAiUsage() to optionally persist prompts and parsed outputs +39/-2
• Adds 'AiResponseParams' and a 'response' field on 'AiUsageParams' to support nested creation of 'AiResponse' records. Includes 'serializeSystemPrompt()' to store multi-part system prompts (joined with delimiters).
src/lib/extract/ai-usage.ts
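
As an illustration of what joining a multi-part system prompt might look like, here is a minimal sketch. It is not the PR's `serializeSystemPrompt()`: the function name `serializeSystemPromptSketch`, the delimiter, and the choice to drop blank parts are all assumptions made for the example.

```typescript
// Assumed delimiter. The PR summary only says parts are "joined with delimiters".
const PART_DELIMITER = "\n\n---\n\n";

// Accepts either a single prompt string or an ordered list of prompt parts
// and returns one string suitable for a text column.
function serializeSystemPromptSketch(parts: string | readonly string[]): string {
  if (typeof parts === "string") return parts;
  return parts
    .filter((part) => part.trim().length > 0) // skip empty parts (an assumption)
    .join(PART_DELIMITER);
}
```

A visible delimiter keeps the stored prompt readable in ad-hoc SQL output, though it is not guaranteed to be reversible if a part happens to contain the same marker.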

openai.ts: Log extraction prompts and parsed event output to ai_responses +12/-0
• Augments 'extractEvent()' logging to include stage, system/user prompts, and parsed structured output when successful. Also captures prompts on failure for post-mortem debugging.
src/lib/extract/openai.ts
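
The success and failure paths described here differ only in whether parsed output is attached. A rough sketch of that branching follows; the field names `stage`, `systemPrompt`, `userPrompt`, and `parsedOutput` come from the review snippet in this PR, while `ResponseParamsSketch` and `buildResponseParams` are hypothetical names and do not mirror the real `AiResponseParams` type.

```typescript
// Hypothetical subset of the response payload passed to logAiUsage().
interface ResponseParamsSketch {
  stage: string;
  systemPrompt: string;
  userPrompt: string;
  parsedOutput?: unknown; // left out when the call produced nothing parseable
}

// Always records the prompts; attaches parsed output only when it exists,
// so failed calls still leave a trail for post-mortem debugging.
function buildResponseParams(
  stage: string,
  systemPrompt: string,
  userPrompt: string,
  parsed: unknown,
): ResponseParamsSketch {
  const base: ResponseParamsSketch = { stage, systemPrompt, userPrompt };
  if (parsed === null || parsed === undefined) return base;
  return { ...base, parsedOutput: parsed };
}
```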

discover.ts: Persist discover prompts, parsed places output, and taxonomy metadata +22/-0
• Captures multi-part system prompt content and logs 'discover' stage prompts/output via 'logAiUsage()' for auditing. Includes taxonomy version and prompt revision to correlate outputs with taxonomy changes.
src/lib/places/discover.ts

enrich.ts: Persist enrich prompts and parsed place enrichment output +21/-2
• Extracts system/user prompt content from the message list and logs it along with parsed enrichment output. Expands meta to include 'searchType' and stores 'subjectName' for pre-persist correlation.
src/lib/places/enrich.ts
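
Pulling prompt text back out of a chat message list could look roughly like the sketch below. It assumes plain string `content` on every message and concatenates same-role messages with a blank line; `ChatMessageSketch` and `splitPrompts` are hypothetical names, and the actual code in enrich.ts may handle roles or structured content differently.

```typescript
// Simplified message shape; real chat messages may carry structured content.
interface ChatMessageSketch {
  role: "system" | "user" | "assistant";
  content: string;
}

// Collects system and user text separately so each can be stored in its own column.
function splitPrompts(messages: readonly ChatMessageSketch[]): {
  systemPrompt: string;
  userPrompt: string;
} {
  const pick = (role: ChatMessageSketch["role"]): string =>
    messages
      .filter((m) => m.role === role)
      .map((m) => m.content)
      .join("\n\n");
  return { systemPrompt: pick("system"), userPrompt: pick("user") };
}
```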

editorial.ts: Log canonical event editorial prompts/output with entity linkage +20/-0
• Adds 'response' logging for 'canonical_event' editorial calls, linking responses to the canonical event ID. Persists prompts, parsed output, and taxonomy/prompt revision metadata for auditability.
src/lib/extract/editorial.ts

editorialPlace.ts: Log place editorial prompts/output with entity linkage +20/-0
• Adds 'response' logging for place editorial curation, linking stored responses to 'place' entity IDs. Persists prompts, parsed output, and taxonomy/prompt revision metadata for later inspection.
src/lib/extract/editorialPlace.ts

listings.ts: Log listing extraction prompts and URL output payload +12/-0
• Adds 'response' logging for listing extraction calls, storing system/user prompts and the extracted URLs as JSON output. Captures prompts on failure as well.
src/lib/extract/listings.ts

resolveEvent.ts: Log suggest-event prompts and parsed result for auditing +20/-0
• Adds 'response' logging for suggest-event resolution, capturing stage, prompts, subject title, and parsed structured output. Also logs prompts in the 'no parsed output' failure path.
src/lib/suggest/resolveEvent.ts

Other (12) +2075 / -8

schema.prisma: Add AiResponse model and AiEntityType enum linked to AiUsage +35/-1
• Introduces 'AiEntityType' and a new 'AiResponse' model storing prompts, parsed output, and taxonomy/prompt metadata. Adds an 'AiUsage.aiResponses' relation with cascade delete to retain per-call audit data.
prisma/schema.prisma

migration.sql: Create ai_responses table, enum, indexes, and FK to ai_usage +32/-0
• Adds Postgres enum 'baywire.AiEntityType', creates 'baywire.ai_responses', and indexes for entity/time and subject name/time lookups. Enforces FK to 'baywire.ai_usage' with 'ON DELETE CASCADE' for cleanup consistency.
prisma/migrations/20260613120000_ai_responses/migration.sql

enums.ts: Generate AiEntityType enum for Prisma client +9/-0
• Adds the generated 'AiEntityType' enum constants and type to match the new schema enum.
prisma/generated/enums.ts

commonInputTypes.ts: Generate filters for nullable AiEntityType fields +34/-0
• Adds generated Prisma input filter types for querying nullable 'AiEntityType' fields and their aggregate variants.
prisma/generated/commonInputTypes.ts

AiResponse.ts: Generate Prisma model/types for AiResponse +1641/-0
• Adds the generated Prisma model file for 'AiResponse', including payload types, CRUD args, field refs, and aggregates. Enables typed access to 'ai_responses' via Prisma client.
prisma/generated/models/AiResponse.ts

AiUsage.ts: Generate AiUsage relation fields for aiResponses +169/-0
• Updates generated 'AiUsage' types to include the 'aiResponses' relation and nested create/update inputs for writing responses alongside usage.
prisma/generated/models/AiUsage.ts

models.ts: Export AiResponse model types from generated index +1/-0
• Re-exports the new 'AiResponse' generated model/types so consumers can import it from the Prisma generated barrel file.
prisma/generated/models.ts

client.ts: Expose AiResponse type in generated Prisma client +5/-0
• Adds 'AiResponse' model type export to the generated client typings for app-level imports.
prisma/generated/client.ts

browser.ts: Expose AiResponse type in browser Prisma bundle +5/-0
• Adds 'AiResponse' model type export in the generated browser entrypoint for parity with server client typings.
prisma/generated/browser.ts

prismaNamespace.ts: Register AiResponse model in Prisma internal namespace +110/-2
• Updates Prisma internal model registry/type map to include 'AiResponse' operations and scalar enums, enabling generated client methods for the new model.
prisma/generated/internal/prismaNamespace.ts

prismaNamespaceBrowser.ts: Register AiResponse scalar fields in browser namespace +20/-1
• Adds 'AiResponse' to the browser-side Prisma namespace registry, including scalar field enums needed for query building.
prisma/generated/internal/prismaNamespaceBrowser.ts

class.ts: Regenerate Prisma runtime schema/model metadata for AiResponse +14/-4
• Updates the generated Prisma runtime metadata (inline schema and runtimeDataModel) to reflect the new enum/model and 'AiUsage.aiResponses' relation.
prisma/generated/internal/class.ts

qodo-code-review · 2026-06-17T03:38:54Z

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📜 Skill insights (0)

Context used

✅ Compliance rules (platform): 16 rules

1. Indexed subjectName too long 🐞 Bug ☼ Reliability

Description

AiResponse.subjectName is indexed and resolveSuggestedEvent now stores the unbounded user-provided
title into it; very long titles can exceed PostgreSQL btree index entry limits and make the
aiUsage+aiResponses insert fail, silently dropping the audit record. This undermines the PR’s main
goal (classification/debug auditability) because logAiUsage swallows write errors.

Code

src/lib/suggest/resolveEvent.ts[R109-114]

+      response: {
+        stage: "suggest",
+        subjectName: input.title,
+        systemPrompt: SYSTEM_PROMPT,
+        userPrompt: userContent,
+        parsedOutput: parsed as Prisma.InputJsonValue,

Relevance

⭐⭐⭐ High
The team accepted a similar “don’t swallow errors; fail job” reliability fix in PR #19.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence

The PR introduces an index on subject_name and starts persisting input.title into that indexed
column, but the title is not length-limited anywhere in the type definition. Because the DB write
happens inside logAiUsage() and errors are caught and only logged as a warning, insert failures
will silently drop the intended audit trail.

prisma/schema.prisma[468-488]
src/lib/suggest/types.ts[15-24]
src/lib/suggest/resolveEvent.ts[84-116]
src/lib/extract/ai-usage.ts[53-95]
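
The silent-drop behavior described in this evidence can be shown in isolation. The sketch below is not the project's `logAiUsage()`: `writeAuditSwallowing`, `writeAuditReporting`, and the `AuditWriter` type are hypothetical, and the writer stands in for whatever database call performs the insert. The first function mimics catch-and-warn; the second is one possible alternative that stays non-fatal but lets the caller see that the row was lost.

```typescript
// Stand-in for the database write (for example a nested Prisma create).
type AuditWriter = () => Promise<void>;
type WarnFn = (message: string, error: unknown) => void;

// Catch-and-warn: the caller cannot tell whether the audit row was stored.
async function writeAuditSwallowing(
  write: AuditWriter,
  warn: WarnFn = console.warn,
): Promise<void> {
  try {
    await write();
  } catch (error) {
    warn("audit write failed", error);
  }
}

// Alternative: still does not throw, but reports the outcome so the caller
// can retry, count failures, or fail the job.
async function writeAuditReporting(
  write: AuditWriter,
): Promise<{ ok: true } | { ok: false; error: unknown }> {
  try {
    await write();
    return { ok: true };
  } catch (error) {
    return { ok: false, error };
  }
}
```

Whether a failed audit write should abort ingestion is a policy decision for the team; the second form only makes that decision possible at the call site.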

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution.

### Issue description
`ai_responses` adds a btree index on `subject_name`, and the suggest pipeline now stores `input.title` into `AiResponse.subjectName`. Since `SuggestEventInput.title` is an unconstrained `string`, a sufficiently long title can cause PostgreSQL to reject the insert due to index tuple size limits, and because `logAiUsage()` catches and only warns, the audit record is silently lost.

### Issue Context
- `AiResponse.subjectName` is indexed (`@@index([subjectName, createdAt])`).
- `resolveSuggestedEvent()` uses `subjectName: input.title`.
- `SuggestEventInput.title` has no length bound.
- `logAiUsage()` swallows Prisma write failures.

### Fix Focus Areas
- prisma/schema.prisma[468-489]
- src/lib/suggest/types.ts[15-24]
- src/lib/suggest/resolveEvent.ts[84-116]
- src/lib/extract/ai-usage.ts[53-95]

### Suggested fix
1. Enforce a safe maximum length for `AiResponse.subjectName` (e.g. `@db.VarChar(512)` in Prisma + a migration).
2. Defensively truncate in the write path (e.g. `subjectName: params.response.subjectName?.slice(0, 512) ?? null`) so unexpected long strings never hit the DB/index.
3. Optionally, also validate/truncate `SuggestEventInput.title` at the boundary where user input enters the pipeline.
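
Step 2 of the suggested fix can be sketched as a small helper. This is an illustration under assumptions: `clampSubjectName` and `MAX_SUBJECT_NAME_CHARS` are hypothetical names, and 512 simply echoes the example limit above. Unlike a plain `.slice(0, 512)`, which counts UTF-16 code units and can cut an emoji in half, this version truncates by code point.

```typescript
// Example limit taken from the VarChar(512) suggestion; tune to the real column.
const MAX_SUBJECT_NAME_CHARS = 512;

// Returns null for missing values and otherwise caps the string at `max`
// code points, so surrogate pairs are never split.
function clampSubjectName(
  value: string | null | undefined,
  max: number = MAX_SUBJECT_NAME_CHARS,
): string | null {
  if (value === null || value === undefined) return null;
  const codePoints = Array.from(value); // string iteration yields whole code points
  return codePoints.length <= max ? value : codePoints.slice(0, max).join("");
}
```

At the write site this would read something like `subjectName: clampSubjectName(params.response.subjectName)`, reusing the parameter path from the suggestion above.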


vercel Bot deployed to test June 13, 2026 23:30

nicholasjpanella marked this pull request as ready for review June 17, 2026 03:12

nicholasjpanella wants to merge 1 commit into main from dev/ai-response-storage-f70e
