fix: decouple visual styling from tbl_hierarchical_rate_by_grade to prevent Cartesian join explosion#228
fix: decouple visual styling from tbl_hierarchical_rate_by_grade to prevent Cartesian join explosion#228Melkiades wants to merge 62 commits into
Conversation
|
@jszczypinski sorry for adding you, I just asked for last committer ahah |
…revent Cartesian join explosion tbl_hierarchical_rate_by_grade() was blanking the label column for grade rows before returning, causing tbl_merge() inside tbl_with_pools() to lose row uniqueness and produce a Cartesian cross-join. Split into two functions: - tbl_hierarchical_rate_by_grade(): returns structurally pristine table with unique label values, injects custom_info metadata - add_grade_column(): post-processing function that applies all visual formatting (label_grade column, label blanking, headers, indentation) Co-authored-by: Ona <no-reply@ona.com>
…revent Cartesian join explosion tbl_hierarchical_rate_by_grade() was blanking the label column for grade rows before returning, causing tbl_merge() inside tbl_with_pools() to lose row uniqueness and produce a Cartesian cross-join. Split into two functions: - tbl_hierarchical_rate_by_grade(): returns structurally pristine table with unique label values, injects custom_info metadata - add_grade_column(): post-processing function that applies all visual formatting (label_grade column, label blanking, headers, indentation)
- add_grade_column() now returns early if label_grade already exists - metadata lookup iterates over x$tbls instead of hardcoding tbls[[1]] - added idempotency test (test #8) - fixed broken comment block in tbl_with_pools test Co-authored-by: Ona <no-reply@ona.com>
- add_grade_column() now returns early if label_grade already exists - metadata lookup iterates over x$tbls instead of hardcoding tbls[[1]] - added idempotency test (test #8) - fixed broken comment block in tbl_with_pools test
Co-authored-by: Ona <no-reply@ona.com>
1afacd6 to
8becdc8
Compare
|
@jszczypinski I did revert them ahah crazy stuff |
Melkiades
left a comment
There was a problem hiding this comment.
Self-review after fixes. All previously identified issues are resolved:
- Idempotency guard added (line 75): early return if
label_gradealready exists, prevents silent corruption on double-call. - Robust metadata lookup (line 78):
Find()iterates overx$tblsinstead of hardcodingtbls[[1]]. - Idempotency test added (test #8): verifies
table_bodyis identical after second call and grade labels are not corrupted. - Comment block fixed in
test-tbl_with_pools.R. - Docs updated to match implementation.
Remaining notes:
- Output snapshots for
tbl_hierarchical_rate_by_gradewere cleared and need regeneration on firstdevtools::test()run (testthat::snapshot_accept()). devtools::document()must be run to exportadd_grade_columnin NAMESPACE.
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+. Co-authored-by: Ona <no-reply@ona.com>
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+.
51ffb39 to
74c8419
Compare
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+. Also added AGENTS.md to .gitignore. Co-authored-by: Ona <no-reply@ona.com>
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+. Also added AGENTS.md to .gitignore.
803c081 to
76f4c0b
Compare
Unit Tests Summary 1 files 233 suites 3m 18s ⏱️ Results for commit 56339db. ♻️ This comment has been updated with latest results. |
Unit Test Performance Difference
Additional test case details
Results for commit fadcafb ♻️ This comment has been updated with latest results. |
Code Coverage SummaryDiff against mainResults for commit: 56339db Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Tightly coupled function now shares the same file and help page via @Rdname. Removed standalone R/add_grade_column.R and its man page. Co-authored-by: Ona <no-reply@ona.com>
Tightly coupled function now shares the same file and help page via @Rdname. Removed standalone R/add_grade_column.R and its man page.
Co-authored-by: Ona <no-reply@ona.com>
…engineering/crane into js/fix-cartesian-join-explosion
…engineering/crane into js/fix-cartesian-join-explosion
Bare `label` inside modify_table_body() lambda is not visible to R CMD check. Use .data$label for all references.
Signed-off-by: Davide Garolini <dgarolini@gmail.com>
|
ready for review @shajoezhu @llrs-roche (Jan is out till monday I believe) |
a6da1a5 to
8c7f6de
Compare
Signed-off-by: Davide Garolini <dgarolini@gmail.com>
Signed-off-by: Davide Garolini <dgarolini@gmail.com>
llrs-roche
left a comment
There was a problem hiding this comment.
I have yet to check the PR locally or verified that the issue is indeed fixed but I have reviewed the code.
See comments below for some concerns
| #' @export | ||
| add_grade_column <- function(x) { |
There was a problem hiding this comment.
Why is added with tbl_hierarchical_rate_by_grade() if it must be called also after any merging?
There was a problem hiding this comment.
Fair question! They share the help page because add_grade_column() only works on output from tbl_hierarchical_rate_by_grade() — it reads the custom_info metadata that function stores. So they are tightly coupled, just called at different points in the pipeline. Added a paragraph to the docs explaining this relationship.
There was a problem hiding this comment.
If this graph describes the situation I think it should be a stand alone function:
flowchart LR
%% Data Nodes and Functions
A["1. Build Table<br>(tbl_hierarchical_rate_by_grade())"]
B["2. Merge Tables (Optional)<br>(tbl_merge() / tbl_with_pools())"]
C["3. Finalize Table<br>(add_grade_column())"]
%% Data Flow
A --> B
B --> C
A -->C
In any case please add a link/@seealso from tbl_with_pools() to the help page of add_grade_column().
|
Do we need a special version of cards or anything? devtools::test(filter = 'add_grade_column')
[ FAIL 7 | WARN 0 | SKIP 0 | PASS 2 ] |
Fixes #226
Description
tbl_hierarchical_rate_by_grade()was blanking thelabelcolumn for grade rows (label = "") before returning. Whentbl_with_pools()calledtbl_merge(), the join on(variable, row_type, var_label, label)lost row uniqueness — every blank-label grade row matched every other, producing a Cartesian cross-join with massive row duplication and column suffixing.Fix: Decouple data generation from visual presentation:
tbl_hierarchical_rate_by_grade()now returns a structurally pristine table wherelabelretains unique grade text. Injectscustom_infometadata for downstream use.add_grade_column()(new exported function) applies all visual formatting (label_gradecolumn, label blanking, headers, indentation) as a post-processing step.New workflow
How to test
Run
devtools::document()thendevtools::test(). Accept new snapshots withtestthat::snapshot_accept().Files changed
R/tbl_hierarchical_rate_by_grade.Rcustom_infometadata injectionR/add_grade_column.Rtests/testthat/test-add_grade_column.Rtests/testthat/test-tbl_hierarchical_rate_by_grade.Radd_grade_column()tests/testthat/test-tbl_with_pools.R