feat: Add public safety training data pipeline #4

Merged
homanp merged 2 commits into main from homanp/public-safety-training-data-pr
Jun 17, 2026

Conversation

@homanp homanp commented Jun 17, 2026

Contributor

Summary

  • Add a public safety training dataset pipeline that converts configured pairwise and single-response safety sources into the project schema.
  • Add the generated dataset summary and track the large training JSONL through Git LFS.
  • Update training defaults, Modal sampling, docs, and tests for the new data/training_data.jsonl flow.
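A minimal sketch of the kind of conversion the first bullet describes, assuming a flat target schema. The field names (`prompt`, `response`, `label`, `source`), the input keys (`chosen`, `rejected`, `is_unsafe`), and the helper names are all hypothetical; the actual project schema and source formats are not shown in this PR.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def from_pairwise(record: Dict[str, Any], source: str) -> Iterator[Dict[str, Any]]:
    """Split one pairwise record into two labeled rows.

    Assumption: the preferred ("chosen") response is treated as safe and
    the other ("rejected") response as unsafe.
    """
    yield {"prompt": record["prompt"], "response": record["chosen"],
           "label": "safe", "source": source}
    yield {"prompt": record["prompt"], "response": record["rejected"],
           "label": "unsafe", "source": source}


def from_single(record: Dict[str, Any], source: str) -> Iterator[Dict[str, Any]]:
    """Map one single-response record with a boolean flag to one row."""
    label = "unsafe" if record["is_unsafe"] else "safe"
    yield {"prompt": record["prompt"], "response": record["response"],
           "label": label, "source": source}


def to_jsonl(rows: Iterable[Dict[str, Any]]) -> str:
    """Serialize rows as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(row, sort_keys=True) + "\n" for row in rows)
```

Both converters yield rows of the same shape, so their output can be chained and written to a single JSONL file such as data/training_data.jsonl.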

Test plan

  • uv run pytest
  • Modal guarded generation smoke checks with the saved probe_out_public_safety probe

open-cla Bot commented Jun 17, 2026

Contributor License Agreement

All contributors are covered by a CLA.

@superagent-security

Superagent didn't find any vulnerabilities or security issues in this PR.

@homanp homanp changed the title from "Add public safety training data pipeline" to "feat: Add public safety training data pipeline" Jun 17, 2026
@homanp homanp self-assigned this Jun 17, 2026
@homanp homanp merged commit f6cc937 into main Jun 17, 2026
7 checks passed