feat: add assay results fetching to packaging state machine #69

Open
drernie wants to merge 1 commit into main from 68-assay-results

Conversation

@drernie commented Apr 8, 2025

Member

#68 support Assay Results

@drernie drernie linked an issue Apr 8, 2025 that may be closed by this pull request
drernie added a commit that referenced this pull request Jun 15, 2026
…ks.json (#143 #389) (#390)

* feat: add entry-reference extractor (shared discovery for #143 + #68/#69)

Pure parser over a Benchling entry dict that surfaces the objects an entry
points at, in one place for both upcoming features:

- extract_entity_references(): entity IDs from days[].notes[].links[]
  (filtered to entity types, dropping non-entity links like sql_dashboard)
  and from entity-link fields; deduped by ID. -> #143 entity packaging.
- extract_results_tables(): results_table notes carrying assayResultSchemaId.
  -> #68/#69 assay results.
- extract_note_links(): low-level all-links primitive.
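
A minimal sketch of how these three parsers could fit together, assuming the entry dict follows the `days[].notes[].links[]` shape named above. The function bodies, the contents of `ENTITY_LINK_TYPES`, and the sample IDs are illustrative assumptions rather than the code in this commit, and the entity-link field path is left out for brevity.

```python
from typing import Any

# Assumed subset of entity link types, for illustration only.
ENTITY_LINK_TYPES = {"custom_entity", "dna_sequence", "aa_sequence"}


def extract_note_links(entry: dict[str, Any]) -> list[dict[str, Any]]:
    """Return every link found under days[].notes[].links[], in order."""
    links: list[dict[str, Any]] = []
    for day in entry.get("days") or []:
        for note in day.get("notes") or []:
            links.extend(note.get("links") or [])
    return links


def extract_entity_references(entry: dict[str, Any]) -> list[str]:
    """Entity IDs from note links, filtered to entity types and deduped."""
    seen: dict[str, None] = {}  # a dict keeps first-seen order
    for link in extract_note_links(entry):
        link_id = link.get("id")
        if link.get("type") in ENTITY_LINK_TYPES and link_id:
            seen.setdefault(link_id, None)
    return list(seen)


def extract_results_tables(entry: dict[str, Any]) -> list[dict[str, Any]]:
    """Return results_table notes that carry an assayResultSchemaId."""
    return [
        note
        for day in entry.get("days") or []
        for note in day.get("notes") or []
        if note.get("type") == "results_table" and note.get("assayResultSchemaId")
    ]


# Hypothetical entry with made-up IDs.
entry = {
    "days": [
        {
            "notes": [
                {
                    "type": "text",
                    "links": [
                        {"type": "custom_entity", "id": "bfi_1"},
                        {"type": "sql_dashboard", "id": "dash_1"},
                        {"type": "custom_entity", "id": "bfi_1"},
                    ],
                },
                {"type": "results_table", "assayResultSchemaId": "assaysch_1"},
            ]
        }
    ]
}

print(extract_entity_references(entry))  # ['bfi_1']
```

The dashboard link is filtered out and the repeated entity collapses to one ID, which is the behavior the bullet list describes.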

No Benchling API calls and no behavior change -- nothing consumes it yet, so
it lands independently of either feature. 13 unit tests; black/isort/pyright
clean.

Refs #143 #68

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: full EntryLink type map + classify_links; fix entity set (#389)

Generalize the discovery layer from entities-only to the full EntryLink enum
(18 types from test/openapi.yaml), per #389.

- classify_links(entry): surfaces ALL note links, each labeled with a
  LinkCategory (entity/inventory/reference/metadata/not_packageable/uncertain/
  external/unknown). Consumers filter, e.g. `r.is_packageable`. Unknown/future
  types surface as UNKNOWN rather than being silently dropped.
- LINK_TYPE_CATEGORY: type -> category for all 18 tokens; PACKAGEABLE_CATEGORIES.
- Fix entity set: ENTITY_LINK_TYPES now {custom_entity, dna_sequence,
  aa_sequence, batch}. Adds `batch` (a real registry entity, was missed);
  drops dna_oligo/rna_oligo (NOT EntryLink types -- can't appear as note links).
- spec/entry-link-types.json: human-facing reference map (category, packageable,
  id prefix, GET endpoint, webhook events) for all 18 types, plus the
  not-inline-linkable resources. A test asserts its categories match the module
  so it can't drift.

26 unit tests; black/isort/pyright clean.

Refs #143 #389 #68

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: write references.json into each package; address review

When packaging an entry, write a references.json alongside entry.json listing
the Benchling objects the entry points at (entities, classified links, results
tables), discovered from the entry's note links and fields. No records are
fetched -- discovery only.

- entry_references.summarize_references(entry): JSON-serializable payload
  ({schema_version, entities, links, results_tables}); REFERENCES_SCHEMA_VERSION.
- entry_packager._create_metadata_files: emit references.json + document it in
  the package README.
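
As a rough illustration of the payload shape and where the file lands, the sketch below builds a dict with the four keys named above and writes it next to where `entry.json` would live. `build_references_payload` and `write_references` are hypothetical helper names, and the schema version value is assumed; the real `summarize_references(entry)` takes the entry itself.

```python
import json
import tempfile
from pathlib import Path
from typing import Any

REFERENCES_SCHEMA_VERSION = 1  # assumed value


def build_references_payload(
    entities: list[str],
    links: list[dict[str, Any]],
    results_tables: list[dict[str, Any]],
) -> dict[str, Any]:
    """Assemble a JSON-serializable payload from already-discovered references."""
    return {
        "schema_version": REFERENCES_SCHEMA_VERSION,
        "entities": list(entities),
        "links": list(links),
        "results_tables": list(results_tables),
    }


def write_references(package_dir: Path, payload: dict[str, Any]) -> Path:
    """Write references.json into the package directory and return its path."""
    path = package_dir / "references.json"
    path.write_text(json.dumps(payload, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return path


# Hypothetical discovered references with made-up IDs.
payload = build_references_payload(
    entities=["bfi_1"],
    links=[{"type": "custom_entity", "id": "bfi_1"}],
    results_tables=[{"assayResultSchemaId": "assaysch_1"}],
)

with tempfile.TemporaryDirectory() as tmp:
    written = write_references(Path(tmp), payload)
    round_tripped = json.loads(written.read_text(encoding="utf-8"))
```

Nothing here calls the Benchling API, consistent with the "discovery only" scope described above.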

Review fixes (Greptile + Copilot):
- Drop empty-string entity IDs in _field_value_ids, matching the note-link guard.
- Narrow RESULTS_TABLE_NOTE_TYPES to {"results_table"} -- the only type carrying
  assayResultSchemaId; avoids latently capturing generic/registration tables.
- Modernize typing (dict/list/tuple) on the 3.11+ codebase.
- Remove committed spec/entry-link-types.json (relocated to the project's
  scripts/ as a research artifact) and its drift-guard test.

Full suite green (437).

Refs #143 #389 #68

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: regression test for empty-string entity-link field values

Confirms _field_value_ids drops empty strings (single + isMulti list),
matching the note-link guard. Requested in review.
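
The guard being tested can be sketched as follows. `field_value_ids` here is a hypothetical stand-in for the private `_field_value_ids`, whose actual body is not shown in this PR; the field dict shape (`value`, `isMulti`) is assumed from the description.

```python
from typing import Any


def field_value_ids(field: dict[str, Any]) -> list[str]:
    """Return IDs from an entity-link field value, dropping empty strings.

    Handles both a single value and an isMulti list of values.
    """
    value = field.get("value")
    candidates = value if isinstance(value, list) else [value]
    return [v for v in candidates if isinstance(v, str) and v]


single_empty = field_value_ids({"value": ""})
multi_mixed = field_value_ids({"isMulti": True, "value": ["bfi_1", "", "bfi_2"]})
```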

Refs #143

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor: replace overloaded `packageable` with fetchable + eventable + disposition

`packageable` conflated two questions: can a record be fetched, vs. should it
live in its own package or nested in the entry. Split into orthogonal axes and a
derived disposition.

- FETCHABLE_CATEGORIES (GET-by-id exists) + EVENTABLE_CATEGORIES (own webhooks,
  can arrive independent of an entry) + CATEGORY_DISPOSITION.
- LinkRef.is_fetchable / is_eventable / disposition (replaces is_packageable).
- references.json links now carry {category, fetchable, eventable, disposition}.

disposition makes explicit that nest-vs-standalone is a genuine product decision
ONLY for entities (fetchable AND eventable -> nest_or_standalone). Non-entities
are forced: inventory -> nest (no events); entry/request/workflow -> link (own
package); metadata -> pointer; dashboards/external -> skip.
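
One way to picture the split is as two membership sets plus a lookup table, sketched below with plain strings. The disposition values follow the paragraph above; which category name covers entry/request/workflow and dashboards, the exact set contents, and the `skip` fallback for unlisted categories are assumptions made for this example.

```python
# Assumed set contents, for illustration only.
FETCHABLE_CATEGORIES = {"entity", "inventory", "reference"}
EVENTABLE_CATEGORIES = {"entity", "reference"}

CATEGORY_DISPOSITION = {
    "entity": "nest_or_standalone",  # the only genuine product decision
    "inventory": "nest",             # fetchable, but no events of its own
    "reference": "link",             # assumed home of entry/request/workflow
    "metadata": "pointer",
    "not_packageable": "skip",       # assumed home of dashboards
    "external": "skip",
}


def describe(category: str) -> dict[str, object]:
    """Derive the three axes for a category name."""
    return {
        "category": category,
        "fetchable": category in FETCHABLE_CATEGORIES,
        "eventable": category in EVENTABLE_CATEGORIES,
        # Falling back to "skip" for unlisted categories is an assumption.
        "disposition": CATEGORY_DISPOSITION.get(category, "skip"),
    }


entity_axes = describe("entity")
inventory_axes = describe("inventory")
```

In this sketch only `entity` maps to `nest_or_standalone`, which mirrors the claim that every other category has its placement forced.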

Project artifact scripts/entry-link-types.json updated to match (not in repo).

Refs #143 #389

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: searchable links metadata + raw links.json (entity name enrichment)

Promote a curated `links` array into entry.json (the package's metadata_uri)
so packages are searchable by the human-readable name of the entities/objects
an entry references — the top-line use case from the 2026-06-15 call ("show me
all experiments where QB-2743.1 was used").

Each curated link carries four fields, each with one job:
  - type, id  — free; id supports downstream linking
  - name      — authoritative Benchling display name via best-effort GET-by-id,
                or null when the lookup fails/isn't supported (never a slug)
  - slug      — lossy token parsed from the webURL, for eyeballing/debugging only

Verified the human name is NOT recoverable from the webURL: the trailing
segment is a lowercased, punctuation-flattened slug (sBN000 -> sbn000), so the
exact name must come from the API. Name resolution is best-effort and never
raises — it requires the app to be a registry/project collaborator.

Also rename references.json -> links.json and reduce it to raw facts only
(id/type/web_url + entities + results_tables). Derived classifications
(category/fetchable/eventable/disposition) are no longer persisted; they are
recomputed from type in code, so the raw archive stays reprocessable and a
future classification change needs no re-fetch. Schema bumped to v2.

Refs #143 #389

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version to 0.17.3

* docs: changelog for 0.17.3 (searchable links metadata + links.json)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version to 0.18.0

* docs: align changelog heading to 0.18.0

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: clarify note-link shape — id and webURL are optional

Note links carry a required type plus optional id (external link has none)
and optional webURL (e.g. location has none). Addresses PR #390 review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

assay results

1 participant