feat: add assay results fetching to packaging state machine #69
Open
drernie wants to merge 1 commit into
drernie added a commit that referenced this pull request on Jun 15, 2026
…ks.json (#143 #389) (#390)

* feat: add entry-reference extractor (shared discovery for #143 + #68/#69)

Pure parser over a Benchling entry dict that surfaces the objects an entry points at, in one place for both upcoming features:

- extract_entity_references(): entity IDs from days[].notes[].links[] (filtered to entity types, dropping non-entity links like sql_dashboard) and from entity-link fields; deduped by ID. -> #143 entity packaging.
- extract_results_tables(): results_table notes carrying assayResultSchemaId. -> #68/#69 assay results.
- extract_note_links(): low-level all-links primitive.

No Benchling API calls and no behavior change -- nothing consumes it yet, so it lands independently of either feature.

13 unit tests; black/isort/pyright clean.

Refs #143 #68

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: full EntryLink type map + classify_links; fix entity set (#389)

Generalize the discovery layer from entities-only to the full EntryLink enum (18 types from test/openapi.yaml), per #389.

- classify_links(entry): surfaces ALL note links, each labeled with a LinkCategory (entity/inventory/reference/metadata/not_packageable/uncertain/external/unknown). Consumers filter, e.g. `r.is_packageable`. Unknown/future types surface as UNKNOWN rather than being silently dropped.
- LINK_TYPE_CATEGORY: type -> category for all 18 tokens; PACKAGEABLE_CATEGORIES.
- Fix entity set: ENTITY_LINK_TYPES now {custom_entity, dna_sequence, aa_sequence, batch}. Adds `batch` (a real registry entity, was missed); drops dna_oligo/rna_oligo (NOT EntryLink types -- can't appear as note links).
- spec/entry-link-types.json: human-facing reference map (category, packageable, id prefix, GET endpoint, webhook events) for all 18 types, plus the not-inline-linkable resources. A test asserts its categories match the module so it can't drift.

26 unit tests; black/isort/pyright clean.

Refs #143 #389 #68

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: write references.json into each package; address review

When packaging an entry, write a references.json alongside entry.json listing the Benchling objects the entry points at (entities, classified links, results tables), discovered from the entry's note links and fields. No records are fetched -- discovery only.

- entry_references.summarize_references(entry): JSON-serializable payload ({schema_version, entities, links, results_tables}); REFERENCES_SCHEMA_VERSION.
- entry_packager._create_metadata_files: emit references.json + document it in the package README.

Review fixes (Greptile + Copilot):

- Drop empty-string entity IDs in _field_value_ids, matching the note-link guard.
- Narrow RESULTS_TABLE_NOTE_TYPES to {"results_table"} -- the only type carrying assayResultSchemaId; avoids latently capturing generic/registration tables.
- Modernize typing (dict/list/tuple) on the 3.11+ codebase.
- Remove committed spec/entry-link-types.json (relocated to the project's scripts/ as a research artifact) and its drift-guard test.

Full suite green (437).

Refs #143 #389 #68

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: regression test for empty-string entity-link field values

Confirms _field_value_ids drops empty strings (single + isMulti list), matching the note-link guard. Requested in review.

Refs #143

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor: replace overloaded `packageable` with fetchable + eventable + disposition

`packageable` conflated two questions: can a record be fetched, vs. should it live in its own package or nested in the entry. Split into orthogonal axes and a derived disposition.

- FETCHABLE_CATEGORIES (GET-by-id exists) + EVENTABLE_CATEGORIES (own webhooks, can arrive independent of an entry) + CATEGORY_DISPOSITION.
- LinkRef.is_fetchable / is_eventable / disposition (replaces is_packageable).
- references.json links now carry {category, fetchable, eventable, disposition}.

disposition makes explicit that nest-vs-standalone is a genuine product decision ONLY for entities (fetchable AND eventable -> nest_or_standalone). Non-entities are forced: inventory -> nest (no events); entry/request/workflow -> link (own package); metadata -> pointer; dashboards/external -> skip.

Project artifact scripts/entry-link-types.json updated to match (not in repo).

Refs #143 #389

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: searchable links metadata + raw links.json (entity name enrichment)

Promote a curated `links` array into entry.json (the package's metadata_uri) so packages are searchable by the human-readable name of the entities/objects an entry references -- the top-line use case from the 2026-06-15 call ("show me all experiments where QB-2743.1 was used").

Each curated link carries four fields, each with one job:

- type, id -- free; id supports downstream linking
- name -- authoritative Benchling display name via best-effort GET-by-id, or null when the lookup fails/isn't supported (never a slug)
- slug -- lossy token parsed from the webURL, for eyeballing/debugging only

Verified the human name is NOT recoverable from the webURL: the trailing segment is a lowercased, punctuation-flattened slug (sBN000 -> sbn000), so the exact name must come from the API. Name resolution is best-effort and never raises -- it requires the app to be a registry/project collaborator.

Also rename references.json -> links.json and reduce it to raw facts only (id/type/web_url + entities + results_tables). Derived classifications (category/fetchable/eventable/disposition) are no longer persisted; they are recomputed from type in code, so the raw archive stays reprocessable and a future classification change needs no re-fetch. Schema bumped to v2.

Refs #143 #389

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version to 0.17.3

* docs: changelog for 0.17.3 (searchable links metadata + links.json)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version to 0.18.0

* docs: align changelog heading to 0.18.0

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: clarify note-link shape -- id and webURL are optional

Note links carry a required type plus optional id (external link has none) and optional webURL (e.g. location has none). Addresses PR #390 review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
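For readers following the discovery step described in the commit message above, here is a minimal sketch of how note links and results tables might be pulled out of an entry dict. It is not the code from the referenced commit: the function names (`sketch_note_links`, `sketch_entity_ids`, `sketch_results_schema_ids`), the assumption that each note carries a `type` key, and all IDs in the sample entry are hypothetical. Only the `days[].notes[].links[]` path, the `results_table` / `assayResultSchemaId` pairing, the four entity link types, and the empty-ID guard come from the commit message.

```python
from typing import Any

# Entity link types as listed in the commit message.
ENTITY_LINK_TYPES = {"custom_entity", "dna_sequence", "aa_sequence", "batch"}


def sketch_note_links(entry: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect every link dict found under days[].notes[].links[]."""
    links: list[dict[str, Any]] = []
    for day in entry.get("days") or []:
        for note in day.get("notes") or []:
            links.extend(note.get("links") or [])
    return links


def sketch_entity_ids(entry: dict[str, Any]) -> list[str]:
    """Entity IDs from note links: entity types only, empty or missing IDs
    dropped, deduplicated by ID with first-seen order preserved."""
    ids = [
        link["id"]
        for link in sketch_note_links(entry)
        if link.get("type") in ENTITY_LINK_TYPES and link.get("id")
    ]
    return list(dict.fromkeys(ids))


def sketch_results_schema_ids(entry: dict[str, Any]) -> list[str]:
    """Schema IDs of results_table notes that carry assayResultSchemaId.
    Assumes the note kind is stored under a "type" key."""
    found: list[str] = []
    for day in entry.get("days") or []:
        for note in day.get("notes") or []:
            schema_id = note.get("assayResultSchemaId")
            if note.get("type") == "results_table" and schema_id:
                found.append(schema_id)
    return found


# Hypothetical entry; every ID below is a placeholder, not real data.
sample_entry = {
    "days": [
        {
            "notes": [
                {
                    "type": "text",
                    "links": [
                        {"type": "custom_entity", "id": "ent_1"},
                        {"type": "sql_dashboard", "id": "dash_1"},  # non-entity
                        {"type": "custom_entity", "id": "ent_1"},   # duplicate
                        {"type": "batch", "id": ""},                # empty ID
                        {"type": "link"},                           # no id at all
                    ],
                },
                {"type": "results_table", "assayResultSchemaId": "schema_1"},
            ]
        }
    ]
}

print(sketch_entity_ids(sample_entry))          # ['ent_1']
print(sketch_results_schema_ids(sample_entry))  # ['schema_1']
```

Keeping the functions pure (dict in, list out, no API calls) mirrors the "discovery only" intent stated in the commit message, and makes the behavior easy to cover with unit tests.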
#68 support Assay Results