Commit c2aee09
Add bias-audit module: cohort-provenance + method-ceiling diagnostics
Implements six new exports for circulating-miRNA biomarker QC:
- os_bias_floor_auc: dataset-identity-only classifier AUC
- os_covariate_only_auc: single-covariate classifier AUC (age, sex, storage)
- os_per_feature_batch_signal: healthy-only ANOVA for batch confounding
- os_clustered_bootstrap_auc: clustered CI respecting specimen duplicates
- os_detect_cross_cohort_duplicates: cross-GSE OV-code crosswalk
- os_bias_audit: top-level orchestrator with print.os_bias_audit method
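
Illustration of the bias-floor idea behind os_bias_floor_auc, as a minimal Python sketch rather than the package's R code. It assumes the "dataset-identity-only classifier" can be approximated by scoring each sample with the case fraction of its own cohort, in-sample. The names `auc` and `bias_floor_auc` are hypothetical, and the actual export may fit and validate its classifier differently.

```python
from collections import defaultdict


def auc(scores, labels):
    """Mann-Whitney AUC; tied case/control pairs count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bias_floor_auc(cohorts, labels):
    """AUC of a 'classifier' that sees only cohort identity.

    Assumption: each sample is scored by the in-sample case fraction
    of its cohort. No biomarker feature is used at any point.
    """
    counts = defaultdict(lambda: [0, 0])  # cohort -> [n_cases, n_total]
    for cohort, y in zip(cohorts, labels):
        counts[cohort][0] += y
        counts[cohort][1] += 1
    scores = [counts[c][0] / counts[c][1] for c in cohorts]
    return auc(scores, labels)


# Toy data (made up): cohort A is case-heavy, cohort B is control-heavy.
cohorts = ["A"] * 10 + ["B"] * 10
labels = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
print(bias_floor_auc(cohorts, labels))
```

When every cohort has the same case fraction, all scores tie and the sketch returns 0.5. Anything above that is discrimination available from provenance alone.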
All six functions are generic enough for any omics dataset and do not
depend on miRNA specifics. 14 testthat cases cover asymmetric cohort,
age-confound, feature-batch-signal ranking, clustered-vs-naive bootstrap
width, cross-cohort dup detection, and orchestrator integration.
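
Illustration of the clustered-vs-naive bootstrap distinction the tests exercise, again as a Python sketch and not the package's R code. It assumes that "clustered" means resampling whole specimen clusters with replacement, so duplicate measurements of one specimen always travel together, and that the interval is a simple percentile CI. The function names, the `n_boot` and `seed` parameters, and the toy data are all hypothetical.

```python
import random


def auc(scores, labels):
    """Mann-Whitney AUC; tied case/control pairs count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def clustered_bootstrap_auc(scores, labels, clusters, n_boot=500, seed=0):
    """Percentile 95% CI for AUC, resampling clusters instead of rows."""
    rng = random.Random(seed)
    by_cluster = {}
    for i, c in enumerate(clusters):
        by_cluster.setdefault(c, []).append(i)
    ids = list(by_cluster)

    stats = []
    attempts = 0
    while len(stats) < n_boot and attempts < 20 * n_boot:
        attempts += 1
        # Draw as many clusters as exist, with replacement.
        drawn = [rng.choice(ids) for _ in ids]
        idx = [i for c in drawn for i in by_cluster[c]]
        ys = [labels[i] for i in idx]
        # Skip resamples that lost one class; AUC is undefined there.
        if 0 < sum(ys) < len(ys):
            stats.append(auc([scores[i] for i in idx], ys))
    if len(stats) < n_boot:
        raise RuntimeError("too few valid resamples")

    stats.sort()
    lower = stats[int(0.025 * (n_boot - 1))]
    upper = stats[int(0.975 * (n_boot - 1))]
    return lower, upper


# Toy data (made up): specimens s1 and s3 were each measured twice.
scores = [0.9, 0.8, 0.85, 0.3, 0.2, 0.6, 0.4, 0.7]
labels = [1, 1, 1, 0, 0, 1, 0, 0]
clusters = ["s1", "s1", "s2", "s3", "s3", "s4", "s5", "s6"]
print(clustered_bootstrap_auc(scores, labels, clusters))
```

A naive bootstrap would resample the eight rows independently and treat duplicated specimens as separate evidence, which typically gives a narrower interval than the data support.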
Motivated by the 2026-04-18 miRPOC OC meta-analysis, which showed that
a cohort-identity classifier reaches AUC 0.72 on four public 3D-Gene
consortium cohorts before any biomarker feature is used, and that
matching cases and controls on institution and storage collapses the
apparent 14-panel AUC from 0.95 to 0.74.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 90391d3 commit c2aee09
11 files changed
Lines changed: 1222 additions & 0 deletions