Skip to content

feat(lookml): distinct/post-SQL measures and fiscal timeframes#209

Merged
nicosuave merged 4 commits into
mainfrom
update-lookml-adapter
Jun 15, 2026
Merged

feat(lookml): distinct/post-SQL measures and fiscal timeframes#209
nicosuave merged 4 commits into
mainfrom
update-lookml-adapter

Conversation

@nicosuave

Copy link
Copy Markdown
Member

What changed

Extends the LookML import adapter (sidemantic/adapters/lookml.py) to support several upstream measure types and dimension_group timeframes that were previously dropped or mishandled. Verified against the Looker LookML parameter reference (param-measure-types, param-field-dimension-group).

Distinct aggregate measures

sum_distinct, average_distinct, median_distinct, percentile_distinct now import as derived measures with an explicit DISTINCT aggregation (SUM(DISTINCT ...), AVG(DISTINCT ...), MEDIAN(DISTINCT ...), PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY DISTINCT ...)). The sql_distinct_key parameter (which defines the unique entity used to avoid double counting under join fanout) is resolved and preserved in the metric meta so the de-duplication entity is not lost. percentile_distinct honors the percentile: parameter.

Post-SQL / table-calculation measures

  • running_total maps to a cumulative metric over the referenced base measure.
  • percent_of_total maps to a derived window measure: base / NULLIF(SUM(base) OVER (), 0).
  • percent_of_previous maps to a derived window measure using LAG(...) OVER ().

The referenced base measure (sql: ${measure}) is resolved to its dependency name, and the originating table-calculation kind is recorded in meta.

Non-standard / fiscal dimension_group timeframes

Timeframes that truncate a timestamp (time, date, week, month, quarter, year, plus minute/second variants and fiscal_quarter/fiscal_year) remain type: time dimensions with the appropriate granularity. "Extracted part" timeframes now produce numeric or categorical dimensions with portable DuckDB-compatible extraction SQL:

  • Numeric: month_num, week_of_year, quarter_of_year, hour_of_day, day_of_month, day_of_year, day_of_week_index, fiscal_month_num, fiscal_quarter_of_year.
  • Categorical: day_of_week, month_name.

fiscal_month_num / fiscal_quarter_of_year honor fiscal_month_offset when present (defaulting to a calendar offset of 0).

Why

An audit flagged these LookML constructs as unsupported. They appear in real-world Looker blocks already present in the test fixtures (e.g. GA360, redshift admin, pylookml), where extracted-part timeframes were previously coerced to day-granularity time dimensions and distinct/post-SQL measures were dropped or mis-typed.

Tests

New fixture tests/fixtures/lookml/advanced_measure_types.lkml and tests/adapters/lookml/test_advanced_measure_types.py cover each new measure type and timeframe, including end-to-end correctness of distinct de-duplication and timeframe extraction. All existing LookML tests remain green; backward compatibility for previously supported formats/fields is preserved.

Known limitations

  • Distinct measures emit <AGG>(DISTINCT expr), which de-duplicates by the aggregated expression itself; sql_distinct_key is preserved in meta for fidelity but is not woven into a hash-based dedup of the exact form Looker generates. This is correct for the common case where the aggregated value is unique per entity.
  • percent_of_total / percent_of_previous use unpartitioned OVER () window frames (column-wide), matching Looker's default table-calculation scope; query-time partitioning is not inferred.

Add support for several LookML measure types and dimension_group
timeframes that the adapter previously dropped or mishandled:

- Distinct aggregate measures (sum_distinct, average_distinct,
  median_distinct, percentile_distinct), emitted as derived measures
  with explicit DISTINCT aggregation and honoring sql_distinct_key
  (preserved in meta).
- Post-SQL / table-calculation measures: running_total maps to a
  cumulative metric over the base measure; percent_of_total and
  percent_of_previous map to derived window-function measures.
- Non-standard / fiscal dimension_group timeframes: fiscal_quarter,
  fiscal_year, fiscal_month_num, fiscal_quarter_of_year, day_of_week,
  day_of_week_index, month_name, month_num, week_of_year,
  quarter_of_year, hour_of_day, day_of_month, day_of_year and related,
  producing numeric/categorical dimensions with portable extraction SQL
  while truncation timeframes remain time dimensions.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f51863230d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/lookml.py
Comment thread sidemantic/adapters/lookml.py
Comment thread sidemantic/adapters/lookml.py Outdated
…cal timeframes

- sum_distinct/average_distinct with sql_distinct_key now dedupe by the key
  entity via a symmetric aggregate instead of SUM(DISTINCT value), so two
  distinct entities sharing a value are both counted (previously collapsed).
  Uses DECIMAL casts to avoid float overflow on the hash offset.
- percent_of_total / percent_of_previous qualify their base measure ref with
  {model} and wrap it in the base measure's own aggregate, so the generator
  resolves it to the base measure's _raw CTE column. Previously the emitted SQL
  referenced a bare, out-of-scope column and failed to compile.
- fiscal_quarter / fiscal_year honor fiscal_month_offset by shifting the
  timestamp before the calendar truncation, so non-calendar fiscal years bucket
  into correct fiscal periods instead of calendar boundaries.

Adds end-to-end regression tests that execute the compiled SQL against DuckDB.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: faa2b0dbe7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/lookml.py Outdated
Comment thread sidemantic/adapters/lookml.py Outdated
Comment thread sidemantic/adapters/lookml.py Outdated
…nct sums, count_distinct post-SQL base

- percentile_distinct now emits a parseable ordered-set PERCENTILE_CONT instead
  of the SQLGlot-rejected ORDER BY DISTINCT form, so the metric compiles and runs
- keyed sum_distinct/average_distinct bound the HASH offset (% 1<<61) so the
  DECIMAL symmetric aggregate no longer overflows past ~100 distinct keys
- percent_of_total/percent_of_previous over a count_distinct base measure now
  wrap the reference in COUNT(DISTINCT ...) via aggregate templates

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df1915a0d6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/lookml.py
Comment thread sidemantic/adapters/lookml.py
Comment thread sidemantic/adapters/lookml.py
median_distinct and percentile_distinct with a sql_distinct_key emitted a
plain ordered-set aggregate over the fanned-out rows, so a value repeated
across joined rows skewed the quantile. Collapse (key, value) pairs to one
value per distinct key via LIST + LIST_DISTINCT, then take the continuous
quantile of that per-key list, matching Looker's fan-out-safe semantics.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 00321df7a4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/lookml.py
Comment thread sidemantic/adapters/lookml.py
Comment thread sidemantic/adapters/lookml.py
Comment thread sidemantic/adapters/lookml.py
@nicosuave nicosuave merged commit c3584c8 into main Jun 15, 2026
27 of 29 checks passed
@nicosuave nicosuave deleted the update-lookml-adapter branch June 15, 2026 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant