Skip to content

Parse Holistics dataset metrics, standalone Metric, and PartialDataset.extend#204

Open
nicosuave wants to merge 10 commits into
mainfrom
update-holistics-adapter
Open

Parse Holistics dataset metrics, standalone Metric, and PartialDataset.extend#204
nicosuave wants to merge 10 commits into
mainfrom
update-holistics-adapter

Conversation

@nicosuave

Copy link
Copy Markdown
Member

What changed

The Holistics AML adapter silently dropped dataset-level metric/dimension blocks, standalone top-level Metric blocks, and PartialDataset blocks composed via .extend(). Datasets that contained these produced no semantic entities at all.

This change adds post-parse handling (no grammar changes were required — the ANTLR grammar already produces the right blocks) to surface them:

  • Dataset-level fields: Dataset { ... } blocks (and Dataset x = base.extend(partials) assignments) now surface as a model named after the dataset, carrying their cross-model AQL dimensions and metrics. Dataset metrics also register as graph-level metrics, matching Holistics' first-class dataset-metric semantics.
  • Standalone Metric blocks: top-level Metric name { definition: @aql ... } blocks now register as graph-level metrics (independently reusable, per the Holistics reusable-metric-store pattern).
  • PartialDataset + .extend(): PartialDataset blocks resolve and compose into datasets via .extend(), reusing the existing partial/extend merge machinery.
  • @aql measures with no aggregation_type: the recommended Holistics style (definition purely via AQL, no aggregation_type) is preserved instead of being dropped.

Upstream AQL features now supported (best-effort)

Confirmed against docs.holistics.io (Holistics 4.0 AML):

  • Table functions where(), filter(), group(), select() in a pipeline pass through so the surrounding aggregation still produces the value.
  • Metric modifiers of_all(), exclude(), keep_grains(), relative_period() (period-over-period) preserve the inner aggregation rather than dropping it.
  • Two-argument aggregation form sum(table, expr).
  • Nested aggregation in ratios (e.g. sum(x) / count(y)), with the base pipeline segment now translated as well.
  • Dataset dimensions authored with an aggregate AQL are surfaced as derived metrics, since an aggregate cannot be a groupable dimension.

Tests / fixtures

  • New assertions in the kitchen-sink test for dataset-level metrics and the standalone-Metric / PartialDataset.extend composition (realizing the existing transactions.dataset.aml / kitchen_sink.dataset.aml fixtures, which previously parsed to nothing).
  • New metric_store.aml fixture exercising the reusable-metric-store pattern and the richer AQL functions.
  • A focused unit test for the richer AQL translation.
  • Updated the cross-adapter fixture-contract registries to reflect that these datasets now produce semantic entities (cross-model metrics with no physical source of their own).

Known limitations

  • AQL metric-modifier scope changes (ignore-grouping, period shift) cannot be expressed in a standalone SQL fragment, so they are preserved as the inner aggregation rather than translated to window/scope semantics.
  • Dataset metrics span multiple models, so they are emitted as derived SQL rather than simple typed aggregations.

…aset.extend, richer AQL

Dataset-level metric/dimension blocks, standalone top-level Metric blocks,
and PartialDataset blocks composed via .extend() were silently dropped.

- Surface Dataset blocks (and Dataset = base.extend(partials) assignments) as
  models named after the dataset, carrying their cross-model AQL dimensions and
  metrics. Dataset metrics also register as graph-level metrics.
- Parse standalone top-level Metric blocks as graph-level metrics.
- Resolve PartialDataset blocks for .extend() composition, reusing the existing
  partial/extend merge machinery.
- Handle @aql measures with no aggregation_type (the recommended Holistics
  style) without losing them.
- Best-effort AQL support for where()/filter()/group()/select() table funcs,
  of_all()/exclude()/relative_period() metric modifiers, two-arg aggregation
  (sum(table, expr)), and nested aggregation, preserving the inner aggregation.
- Dataset dimensions authored with an aggregate AQL surface as derived metrics
  rather than groupable dimensions.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b6d39f338b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
Dataset metrics parsed from Dataset blocks were only attached to the
synthetic dataset model. SemanticGraph.add_model() auto-registers only
time_comparison/conversion metric types, so derived/simple dataset metrics
were missing from graph.metrics, list_metrics(), and unqualified
get_metric(). Register them at graph scope on parse, mirroring standalone
Metric block handling.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3f750dee06

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
Comment thread sidemantic/adapters/holistics.py
_resolve_block_from_value only materialized Model/PartialModel typed
blocks, so a dataset declared via inline assignment (Dataset foo =
Dataset {...} / PartialDataset {...}) resolved to None and was silently
dropped along with its dataset-level metrics and dimensions. Widen the
typed-block branch to also handle Dataset/PartialDataset.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53a2664b17

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
Comment thread sidemantic/adapters/holistics.py
Comment thread sidemantic/adapters/holistics.py Outdated
…tion

_collect_dataset_definitions only stored object-assignment items of kind
Dataset, so a PartialDataset declared via assignment (PartialDataset x =
PartialDataset {...}) could not be resolved by a later base.extend(x),
silently dropping the partial's metrics. Collect PartialDataset
assignments too, while keeping them as composition fragments rather than
standalone queryable models (matching named PartialDataset blocks).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9cd56f5bd5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
Comment thread sidemantic/adapters/holistics.py
…ent relationships

Two symmetric gaps for assignment-form (x = Type { ... }) AML declarations:
- Standalone metrics in inline assignment form (Metric x = Metric { ... })
  were not collected, so they were absent from graph.metrics. Collect them
  into metric_blocks alongside block-form metrics.
- Relationships declared inside an inline Dataset assignment were never
  parsed, since the relationship pass only iterated top-level Dataset
  AmlBlocks. Surface resolved assignment-dataset blocks and attach their
  join edges, so cross-model dataset metrics have a join path.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9d4d536e7f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
…xt across extend

When a Dataset (or Model) extends a Partial declared in another module/file,
field definitions were resolved against the consuming file's context, so a
metric/dimension whose definition references a const or use-alias from the
partial's own module imported as the literal identifier instead of its AQL.
Stamp each composed field block with its defining file context and resolve
constants/use-aliases against that context during parsing.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6ace04c501

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f803a3afde

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
When a Dataset extends a PartialDataset from another module and the
partial overrides a same-named metric/dimension, _merge_blocks rebuilt
the merged child block without its context, so the per-field fallback
(item.context or context) resolved against the extending dataset's file.
A field whose definition referenced a const from the partial's module
imported as the literal const identifier instead of the const's AQL.

Carry the context onto merged blocks, preferring the overriding
extension's context and falling back to the base's.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 460d54b01e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
When a Dataset extends a PartialDataset from another module and that
partial contributes a relationships property, qualify the relationship
refs against the partial's authoring module rather than the consuming
file. Previously refs like rel(orders.customer_id > customers.id) were
qualified against the root/consumer file, so edges targeting models such
as finance.orders/finance.customers were skipped.

Stamp the originating context onto AmlProperty items during
cross-module composition (mirroring the existing child-field context
fix) and use the relationships property's origin context when parsing
dataset relationships.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b7e53f25a6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/holistics.py
When a field block composed across modules has a property authored in one
file (e.g. a finance metric's definition: rev_def const) while a sibling
property is overridden from another file, the merged block carries a single
block-level context. Resolve each property against its own stamped origin
context so constants and use aliases bind to the file that defined them
rather than the file that overrode an unrelated property.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 81bcff2d92

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1371 to +1373
for item in block.items:
if not isinstance(item, AmlBlock):
continue

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Import metric reference entries in datasets

When a PartialDataset uses the documented reusable-metric-store shorthand metric total_orders: total_orders, the parser represents that reference as non-AmlBlock items, so this continue drops it. A Dataset d = base.extend(partial_metrics) whose partial only references standalone metrics then produces no d dataset model at all, even though the standalone Metric exists; the new PartialDataset.extend support only works when every reusable metric is rewritten as a full inline block.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant