Audit
registry.json lists 55 jurisdictions, but the labels: [working] flag in actions/pipeline-manager/chn-openstates-files.yml (the publish config) is what actually controls whether each one ships data. Four jurisdictions are listed without that label, and the live govbot-data repos confirm the problem: two are near-empty stubs (22 KB each) and two hold only partial data.
| Code | Name | govbot-data repo size | Likely problem |
|---|---|---|---|
| az | Arizona | 22 KB | empty/broken; investigate scraper |
| ct | Connecticut | 5 MB | partial data; formatter likely failing on something CT-specific |
| tx | Texas | 1 MB | partial data; sessionizer or Texas-specific formatter likely failing |
| va | Virginia | 22 KB | empty/broken; investigate scraper |
Each needs an individual root-cause pass. Open the repo, look at the latest workflow runs in the Actions tab, identify the upstream/format failure, and either fix it or document why it's stuck.
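The "latest workflow runs" step can also be scripted instead of clicked through. The sketch below is a minimal example, assuming the per-state repos are named like az-legislation under the govbot-data org (the pattern used in the Acceptance commands) and that their workflows run in those repos. It uses the public GitHub REST endpoint for workflow runs; the helper names fetch_recent_runs and summarize_failures are hypothetical, not part of govbot.

```python
import json
import urllib.request


def fetch_recent_runs(repo, org="govbot-data", per_page=10):
    """Fetch the most recent workflow runs for one repo via the GitHub REST API.

    Assumption: the repo is public, so no token is sent. Unauthenticated
    requests are rate limited by GitHub.
    """
    url = f"https://api.github.com/repos/{org}/{repo}/actions/runs?per_page={per_page}"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["workflow_runs"]


def summarize_failures(runs):
    """Keep finished runs whose conclusion is anything other than "success".

    Runs that are still in progress have conclusion None and are skipped.
    """
    return [
        (run["name"], run["created_at"], run["html_url"])
        for run in runs
        if run.get("conclusion") not in ("success", None)
    ]
```

For example, summarize_failures(fetch_recent_runs("az-legislation")) would list the recent non-successful runs for Arizona with links to open in the Actions tab.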
Related signal — scrape side has more failures
19 jurisdictions appear in chn-openstates-scrape.yml without labels: [working] (AK, AL, AZ, CA, CO, CT, HI, ID, IN, MA, MD, ME, MI, MS, NY, OH, SD, VA + DC). The files half of the pipeline still publishes for most of these (it's the consumer-facing side), so they're scraper-job failures rather than user-visible data gaps. The 4 above are the ones that are user-visible — start there.
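To make the overlap between the two configs explicit, the two unlabeled lists can be compared as sets. This is a small illustration using only the codes quoted in this issue; it does not parse the YAML files themselves, and the function name split_by_visibility is hypothetical.

```python
# Jurisdictions without labels: [working], copied from the lists in this issue.
SCRAPE_UNLABELED = {
    "AK", "AL", "AZ", "CA", "CO", "CT", "HI", "ID", "IN", "MA",
    "MD", "ME", "MI", "MS", "NY", "OH", "SD", "VA", "DC",
}
FILES_UNLABELED = {"AZ", "CT", "TX", "VA"}


def split_by_visibility(scrape_unlabeled, files_unlabeled):
    """Return (both, scrape_only, files_only) as sorted lists of codes."""
    both = sorted(scrape_unlabeled & files_unlabeled)
    scrape_only = sorted(scrape_unlabeled - files_unlabeled)
    files_only = sorted(files_unlabeled - scrape_unlabeled)
    return both, scrape_only, files_only


both, scrape_only, files_only = split_by_visibility(SCRAPE_UNLABELED, FILES_UNLABELED)
print("unlabeled in both configs:", both)
print("unlabeled in scrape config only:", scrape_only)
print("unlabeled in files config only:", files_only)
```

With the lists as quoted, AZ, CT, and VA are unlabeled in both configs, while TX is unlabeled only in the files config. That would be consistent with the Texas failure sitting in the sessionizer or formatter rather than the scraper, though it still needs to be confirmed from the workflow runs.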
Suggested order of attack
- AZ + VA (22 KB each) — almost certainly a scraper crash that never produces output. Cheapest wins first; one fix likely unblocks both if it's a shared root cause.
- CT + TX — partial data means the scrape runs but the formatter or sessionizer drops records partway. Deeper debugging; expect state-specific edge cases.
Acceptance
- AZ, CT, TX, VA repos in govbot-data have meaningful content (comparable size to neighbors like NM, NV).
- Each one has labels: [working] in actions/pipeline-manager/chn-openstates-files.yml.
- govbot pull az ct tx va followed by govbot source --repos az-legislation ct-legislation tx-legislation va-legislation emits real records.
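The "comparable size to neighbors" criterion could be checked mechanically. The sketch below is one possible approach, not an agreed definition: it reads the size field (reported in KB) from the GitHub REST repository endpoint and flags any target repo smaller than a fraction of the median neighbor size. The repo names nm-legislation and nv-legislation are assumed from the naming pattern above, the 0.5 ratio is an arbitrary placeholder, and both helper names are hypothetical.

```python
import json
import statistics
import urllib.request


def fetch_repo_size_kb(repo, org="govbot-data"):
    """Return the repository size in KB as reported by the GitHub REST API.

    Assumption: the repo is public, so no token is sent.
    """
    url = f"https://api.github.com/repos/{org}/{repo}"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["size"]


def undersized(target_sizes_kb, neighbor_sizes_kb, min_ratio=0.5):
    """List target repos smaller than min_ratio times the median neighbor size.

    min_ratio is a placeholder threshold; pick whatever "comparable" should mean.
    """
    baseline = statistics.median(neighbor_sizes_kb.values())
    return sorted(
        repo for repo, size in target_sizes_kb.items() if size < min_ratio * baseline
    )
```

A check for this issue would fetch sizes for the four target repos and for the neighbor repos, then treat an empty result from undersized as meeting the size criterion. Size alone does not prove the records are valid, so the govbot source check still applies.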
References
- Publish config: actions/pipeline-manager/chn-openstates-files.yml
- Scrape config: actions/pipeline-manager/chn-openstates-scrape.yml
- Registry: actions/govbot/data/registry.json
- Pipeline-manager docs: actions/pipeline-manager/README.md
- Live data org: https://github.com/govbot-data