Parse Holistics dataset metrics, standalone Metric, and PartialDataset.extend #204
nicosuave wants to merge 10 commits into
…aset.extend, richer AQL
Dataset-level metric/dimension blocks, standalone top-level Metric blocks, and PartialDataset blocks composed via .extend() were silently dropped.
- Surface Dataset blocks (and Dataset = base.extend(partials) assignments) as models named after the dataset, carrying their cross-model AQL dimensions and metrics. Dataset metrics also register as graph-level metrics.
- Parse standalone top-level Metric blocks as graph-level metrics.
- Resolve PartialDataset blocks for .extend() composition, reusing the existing partial/extend merge machinery.
- Handle @aql measures with no aggregation_type (the recommended Holistics style) without losing them.
- Best-effort AQL support for where()/filter()/group()/select() table funcs, of_all()/exclude()/relative_period() metric modifiers, two-arg aggregation (sum(table, expr)), and nested aggregation, preserving the inner aggregation.
- Dataset dimensions authored with an aggregate AQL surface as derived metrics rather than groupable dimensions.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b6d39f338b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Dataset metrics parsed from Dataset blocks were only attached to the synthetic dataset model. SemanticGraph.add_model() auto-registers only time_comparison/conversion metric types, so derived/simple dataset metrics were missing from graph.metrics, list_metrics(), and unqualified get_metric(). Register them at graph scope on parse, mirroring standalone Metric block handling.
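The registration gap described in this commit can be sketched as follows. This is a minimal illustration, not the project's code: the `Metric`, `Model`, and `SemanticGraph` classes are hypothetical stand-ins that only borrow the names used above, and `register_dataset` is an invented helper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    type: str  # e.g. "simple", "derived", "time_comparison", "conversion"


@dataclass
class Model:
    name: str
    metrics: list[Metric] = field(default_factory=list)


class SemanticGraph:
    """Hypothetical stand-in for the real graph class."""

    AUTO_REGISTERED = {"time_comparison", "conversion"}

    def __init__(self) -> None:
        self.models: dict[str, Model] = {}
        self.metrics: dict[str, Metric] = {}

    def add_model(self, model: Model) -> None:
        self.models[model.name] = model
        # Mirrors the behavior described above: only these types are promoted.
        for metric in model.metrics:
            if metric.type in self.AUTO_REGISTERED:
                self.metrics[metric.name] = metric


def register_dataset(graph: SemanticGraph, dataset_model: Model) -> None:
    """Add the synthetic dataset model, then promote every metric to graph scope."""
    graph.add_model(dataset_model)
    for metric in dataset_model.metrics:
        # setdefault keeps an already-registered metric of the same name.
        graph.metrics.setdefault(metric.name, metric)
```

With `add_model()` alone, a dataset holding only simple or derived metrics leaves `graph.metrics` empty; the explicit promotion step is what makes them visible to graph-level lookups.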
💡 Codex Review
Reviewed commit: 3f750dee06
_resolve_block_from_value only materialized Model/PartialModel typed
blocks, so a dataset declared via inline assignment (Dataset foo =
Dataset {...} / PartialDataset {...}) resolved to None and was silently
dropped along with its dataset-level metrics and dimensions. Widen the
typed-block branch to also handle Dataset/PartialDataset.
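A rough sketch of the widened typed-block branch, under stated assumptions: `TypedValue` and this simplified `AmlBlock` are hypothetical shapes, and `resolve_block_from_value` only imitates the role of `_resolve_block_from_value` as described in this commit message.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

# Before the fix only the first two kinds were materialized (per the text above).
MATERIALIZABLE_KINDS = {"Model", "PartialModel", "Dataset", "PartialDataset"}


@dataclass
class TypedValue:
    """Hypothetical right-hand side of `Type name = Type { ... }`."""
    type_name: str
    items: list = field(default_factory=list)


@dataclass
class AmlBlock:
    kind: str
    name: str
    items: list = field(default_factory=list)


def resolve_block_from_value(name: str, value: object) -> Optional[AmlBlock]:
    """Turn an inline typed assignment into a block, or None if unsupported."""
    if isinstance(value, TypedValue) and value.type_name in MATERIALIZABLE_KINDS:
        return AmlBlock(kind=value.type_name, name=name, items=list(value.items))
    return None
```

Returning `None` for unknown kinds preserves the previous behavior for everything except the two newly supported dataset kinds.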
💡 Codex Review
Reviewed commit: 53a2664b17
…tion
_collect_dataset_definitions only stored object-assignment items of kind
Dataset, so a PartialDataset declared via assignment (PartialDataset x =
PartialDataset {...}) could not be resolved by a later base.extend(x),
silently dropping the partial's metrics. Collect PartialDataset
assignments too, while keeping them as composition fragments rather than
standalone queryable models (matching named PartialDataset blocks).
💡 Codex Review
Reviewed commit: 9cd56f5bd5
…ent relationships
Two symmetric gaps for assignment-form (x = Type { ... }) AML declarations:
- Standalone metrics in inline assignment form (Metric x = Metric { ... })
were not collected, so they were absent from graph.metrics. Collect them
into metric_blocks alongside block-form metrics.
- Relationships declared inside an inline Dataset assignment were never
parsed, since the relationship pass only iterated top-level Dataset
AmlBlocks. Surface resolved assignment-dataset blocks and attach their
join edges, so cross-model dataset metrics have a join path.
💡 Codex Review
Reviewed commit: 9d4d536e7f
…xt across extend
When a Dataset (or Model) extends a Partial declared in another module/file, field definitions were resolved against the consuming file's context, so a metric/dimension whose definition references a const or use-alias from the partial's own module imported as the literal identifier instead of its AQL. Stamp each composed field block with its defining file context and resolve constants/use-aliases against that context during parsing.
💡 Codex Review
Reviewed commit: 6ace04c501
💡 Codex Review
Reviewed commit: f803a3afde
When a Dataset extends a PartialDataset from another module and the partial overrides a same-named metric/dimension, _merge_blocks rebuilt the merged child block without its context, so the per-field fallback (item.context or context) resolved against the extending dataset's file. A field whose definition referenced a const from the partial's module imported as the literal const identifier instead of the const's AQL. Carry the context onto merged blocks, preferring the overriding extension's context and falling back to the base's.
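The context-carrying rule in this commit (prefer the overriding extension's context, fall back to the base's) might look roughly like this. The `Block` shape and `merge_blocks` function are hypothetical simplifications, not the real `_merge_blocks`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    """Hypothetical field block; `context` names the file that authored it."""
    name: str
    props: dict[str, str] = field(default_factory=dict)
    context: Optional[str] = None


def merge_blocks(base: Block, override: Block) -> Block:
    """Merge a same-named override into a base block without losing context."""
    merged_props = {**base.props, **override.props}  # override wins per property
    return Block(
        name=base.name,
        props=merged_props,
        # Prefer the overriding extension's context, fall back to the base's.
        context=override.context or base.context,
    )
```

Because the merged block keeps a context, a per-field fallback of the form `item.context or context` no longer collapses to the extending dataset's file.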
💡 Codex Review
Reviewed commit: 460d54b01e
When a Dataset extends a PartialDataset from another module and that partial contributes a relationships property, qualify the relationship refs against the partial's authoring module rather than the consuming file. Previously refs like rel(orders.customer_id > customers.id) were qualified against the root/consumer file, so edges targeting models such as finance.orders/finance.customers were skipped. Stamp the originating context onto AmlProperty items during cross-module composition (mirroring the existing child-field context fix) and use the relationships property's origin context when parsing dataset relationships.
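To illustrate the qualification step, here is a small sketch that parses a relationship ref and prefixes its model names with the authoring module. The regular expression, the `qualify_rel` helper, the tuple it returns, and the accepted operator set are all assumptions made for this example; only the `rel(orders.customer_id > customers.id)` form and the `finance.orders` naming come from the text above.

```python
from __future__ import annotations

import re
from typing import Optional

# Assumed shape: rel(<model>.<column> <op> <model>.<column>)
REL_RE = re.compile(r"rel\(\s*(\w+)\.(\w+)\s*([<>-])\s*(\w+)\.(\w+)\s*\)")


def qualify_rel(ref: str, origin_module: str) -> Optional[tuple[str, str, str, str, str]]:
    """Qualify both model names against the module that authored the ref."""
    match = REL_RE.fullmatch(ref.strip())
    if match is None:
        return None
    left_model, left_col, op, right_model, right_col = match.groups()
    prefix = f"{origin_module}." if origin_module else ""
    return (prefix + left_model, left_col, op, prefix + right_model, right_col)
```

Passing the partial's module (for example `finance`) rather than the consumer's is what lets the edge resolve to `finance.orders` and `finance.customers`.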
💡 Codex Review
Reviewed commit: b7e53f25a6
When a field block composed across modules has a property authored in one file (e.g. a finance metric's definition: rev_def const) while a sibling property is overridden from another file, the merged block carries a single block-level context. Resolve each property against its own stamped origin context so constants and use aliases bind to the file that defined them rather than the file that overrode an unrelated property.
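A minimal sketch of per-property resolution, assuming each property carries an optional origin stamp. `Prop`, `resolve_props`, and the `constants` mapping (file name to constant table) are hypothetical; the real adapter's data structures are not shown in this PR text.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Prop:
    """Hypothetical property with the file that authored it."""
    name: str
    value: str
    origin: Optional[str] = None


def resolve_props(
    props: list[Prop],
    block_context: str,
    constants: dict[str, dict[str, str]],
) -> dict[str, str]:
    """Resolve each property's value against its own origin file's constants."""
    resolved: dict[str, str] = {}
    for prop in props:
        context = prop.origin or block_context  # fall back to the block-level context
        file_constants = constants.get(context, {})
        # Leave the value untouched when it is not a known constant in that file.
        resolved[prop.name] = file_constants.get(prop.value, prop.value)
    return resolved
```

Here a `definition` stamped with the finance file binds `rev_def` to that file's constant even though the block-level context points at the file that overrode a sibling property.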
💡 Codex Review
Reviewed commit: 81bcff2d92
    for item in block.items:
        if not isinstance(item, AmlBlock):
            continue
Import metric reference entries in datasets
When a PartialDataset uses the documented reusable-metric-store shorthand metric total_orders: total_orders, the parser represents that reference as non-AmlBlock items, so this continue drops it. A Dataset d = base.extend(partial_metrics) whose partial only references standalone metrics then produces no d dataset model at all, even though the standalone Metric exists; the new PartialDataset.extend support only works when every reusable metric is rewritten as a full inline block.
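One possible shape for handling the shorthand, sketched under assumptions: `MetricRef` is an invented representation of `metric total_orders: total_orders` (the review only says the parser yields non-AmlBlock items), and `collect_metrics` is a hypothetical helper, not the adapter's function.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AmlBlock:
    kind: str
    name: str
    items: list = field(default_factory=list)


@dataclass
class MetricRef:
    """Hypothetical parsed form of `metric <name>: <target>`."""
    name: str
    target: str


def collect_metrics(block: AmlBlock, standalone: dict[str, AmlBlock]) -> dict[str, AmlBlock]:
    """Collect inline metric blocks and resolve reference entries."""
    found: dict[str, AmlBlock] = {}
    for item in block.items:
        if isinstance(item, AmlBlock):
            if item.kind == "metric":
                found[item.name] = item
        elif isinstance(item, MetricRef):
            # Instead of skipping, look the reference up among standalone metrics.
            target = standalone.get(item.target)
            if target is not None:
                found[item.name] = target
    return found
```

Unresolvable references are still skipped here; whether they should raise or warn instead is a design choice this sketch leaves open.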
What changed
The Holistics AML adapter silently dropped dataset-level metric/dimension blocks, standalone top-level Metric blocks, and PartialDataset blocks composed via .extend(). Datasets that contained these produced no semantic entities at all.
This change adds post-parse handling (no grammar changes were required; the ANTLR grammar already produces the right blocks) to surface them:
- Dataset { ... } blocks (and Dataset x = base.extend(partials) assignments) now surface as a model named after the dataset, carrying their cross-model AQL dimensions and metrics. Dataset metrics also register as graph-level metrics, matching Holistics' first-class dataset-metric semantics.
- Metric blocks: top-level Metric name { definition: @aql ... } blocks now register as graph-level metrics (independently reusable, per the Holistics reusable-metric-store pattern).
- PartialDataset + .extend(): PartialDataset blocks resolve and compose into datasets via .extend(), reusing the existing partial/extend merge machinery.
- @aql measures with no aggregation_type: the recommended Holistics style (definition purely via AQL, no aggregation_type) is preserved instead of being dropped.
Upstream AQL features now supported (best-effort)
Confirmed against docs.holistics.io (Holistics 4.0 AML):
- where(), filter(), group(), select() in a pipeline pass through so the surrounding aggregation still produces the value.
- of_all(), exclude(), keep_grains(), relative_period() (period-over-period) preserve the inner aggregation rather than dropping it.
- Two-arg aggregation: sum(table, expr).
- … sum(x) / count(y)), with the base pipeline segment now translated as well.
Tests / fixtures
- … transactions.dataset.aml / kitchen_sink.dataset.aml fixtures, which previously parsed to nothing).
- … metric_store.aml fixture exercising the reusable-metric-store pattern and the richer AQL functions.
Known limitations