Commit 7446bae
Phase 1: OmicSelector 2.0 Modernization - Core Architecture
This commit implements the foundational modernization of OmicSelector,
transforming it into a biomarker discovery platform aligned with the
reporting standards current in 2025 (TRIPOD+AI, PROBAST+AI).
Major Changes:
============
1. Modern ML Framework (framework_modern.R)
- Tidymodels integration with automatic backend detection
- Unified interface for tidymodels and caret workflows
- Smart routing between frameworks based on algorithm availability
- Comprehensive preprocessing pipeline with recipes
- Support for ranger, xgboost, glmnet, SVM, and more
2. Nested Cross-Validation (nested_cv.R)
- Gold-standard nested CV implementation
- Rigorous prevention of data leakage by design
- All preprocessing and feature selection inside folds
- Outer loop: unbiased performance estimation
- Inner loop: hyperparameter tuning and model selection
- Feature stability analysis with Jaccard similarity
- Support for multiple feature selection methods:
* Boruta
* Recursive Feature Elimination (RFE)
* LASSO
* Stability Selection
- Automatic calibration assessment
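A minimal base-R sketch of the nested structure described in this item. It is
not the code in nested_cv.R: the data are simulated, the helper names
(make_folds, centroid_predict, fit_and_score) are hypothetical, and a
nearest-centroid classifier with a simple mean-difference feature ranking
stands in for the real learners and selectors. It only illustrates that
scaling, feature ranking, and the choice of k (number of features kept) are
all computed from training rows.

```r
set.seed(42)
n <- 60; p <- 20
y <- rep(0:1, each = n / 2)                 # balanced simulated outcome
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
x[, 1] <- x[, 1] + 1.5 * y                  # only feature 1 carries signal

# Randomly assign n rows to k roughly equal folds.
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

# Nearest-centroid classifier: predict the class whose training mean is closer.
centroid_predict <- function(x_tr, y_tr, x_te) {
  c0 <- colMeans(x_tr[y_tr == 0, , drop = FALSE])
  c1 <- colMeans(x_tr[y_tr == 1, , drop = FALSE])
  d0 <- rowSums(sweep(x_te, 2, c0)^2)
  d1 <- rowSums(sweep(x_te, 2, c1)^2)
  as.integer(d1 < d0)
}

# Scale, rank features, fit, and score; every statistic comes from x_tr only.
fit_and_score <- function(x_tr, y_tr, x_te, y_te, k) {
  mu  <- colMeans(x_tr)
  sdv <- apply(x_tr, 2, sd)
  x_tr <- scale(x_tr, center = mu, scale = sdv)
  x_te <- scale(x_te, center = mu, scale = sdv)   # reuse training statistics
  score <- abs(colMeans(x_tr[y_tr == 1, , drop = FALSE]) -
               colMeans(x_tr[y_tr == 0, , drop = FALSE]))
  keep <- order(score, decreasing = TRUE)[seq_len(k)]
  pred <- centroid_predict(x_tr[, keep, drop = FALSE], y_tr,
                           x_te[, keep, drop = FALSE])
  mean(pred == y_te)                               # accuracy on held-out rows
}

grid <- c(1, 5, 10)                # candidate numbers of features to keep
outer_folds <- make_folds(n, 5)
outer_acc <- numeric(5)
chosen_k  <- numeric(5)

for (i in 1:5) {
  tr <- outer_folds != i
  x_tr <- x[tr, ]; y_tr <- y[tr]

  # Inner loop: tune k using outer-training rows only.
  inner_folds <- make_folds(sum(tr), 3)
  inner_acc <- sapply(grid, function(k) {
    mean(sapply(1:3, function(j) {
      itr <- inner_folds != j
      fit_and_score(x_tr[itr, ], y_tr[itr], x_tr[!itr, ], y_tr[!itr], k)
    }))
  })
  chosen_k[i] <- grid[which.max(inner_acc)]

  # Outer loop: refit with the tuned k and score on the untouched outer fold.
  outer_acc[i] <- fit_and_score(x_tr, y_tr, x[!tr, ], y[!tr], chosen_k[i])
}

mean(outer_acc)   # performance estimate averaged over outer folds
```

Because the outer test fold never influences scaling, ranking, or tuning, the
averaged outer accuracy is not inflated by selection on the test data.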
3. TRIPOD+AI & PROBAST+AI Compliance (compliance.R)
- Automated TRIPOD+AI report generation (27 checklist items)
- PROBAST+AI risk of bias assessment (4 domains)
- Export to HTML, PDF, JSON, or Markdown
- Automated recommendations for improvement
- Ensures transparent reporting for clinical prediction models
4. Modern Dependencies & Infrastructure
- Updated DESCRIPTION to version 2.0.0
- Added tidymodels ecosystem (parsnip, recipes, tune, workflows, yardstick)
- Modern ML backends (ranger, xgboost, glmnet)
- Maintained backward compatibility with caret
- Added comprehensive Suggests for future phases
5. CI/CD & Testing
- GitHub Actions workflows:
* R-CMD-check (Ubuntu, macOS, Windows)
* Test coverage with covr
* Pkgdown documentation deployment
- Comprehensive test suite using testthat 3.0
- Tests for all new components
- ~30 test cases covering core functionality
6. Documentation
- MODERNIZATION.md: Comprehensive modernization guide
- modern_workflow.Rmd: Tutorial vignette
- Migration guide for existing users
- Examples of data leakage prevention
- Best practices and troubleshooting
Key Features:
=============
✅ Data Leakage Prevention
- Preprocessing inside resampling folds
- Feature selection nested properly
- Hyperparameter tuning in inner loop only
✅ Feature Selection Stability
- Multiple methods (Boruta, RFE, LASSO, Stability Selection)
- Jaccard similarity between folds
- Nogueira stability metrics
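For reference, a small base-R sketch of Jaccard similarity between per-fold
feature sets. The function names (jaccard, mean_pairwise_jaccard) and the
feature labels are hypothetical and chosen for illustration; this is not
necessarily how nested_cv.R computes or aggregates the metric.

```r
# Jaccard similarity of two feature sets: |A intersect B| / |A union B|.
jaccard <- function(a, b) {
  union_size <- length(union(a, b))
  if (union_size == 0) return(1)   # assumed convention: two empty sets match
  length(intersect(a, b)) / union_size
}

# Average Jaccard similarity over all pairs of folds.
mean_pairwise_jaccard <- function(selected) {
  pairs <- combn(length(selected), 2)
  mean(apply(pairs, 2, function(ix) {
    jaccard(selected[[ix[1]]], selected[[ix[2]]])
  }))
}

# Hypothetical features selected in three outer folds.
selected <- list(
  fold1 = c("feat_a", "feat_b", "feat_c"),
  fold2 = c("feat_a", "feat_b", "feat_d"),
  fold3 = c("feat_a", "feat_c", "feat_d")
)

mean_pairwise_jaccard(selected)   # 0.5 for these three sets
```

Values near 1 mean the folds select almost the same features; values near 0
mean the selection changes heavily with the training data.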
✅ Clinical Reporting Standards
- TRIPOD+AI compliant reports
- PROBAST+AI risk assessment
- Automated checklist completion
✅ Backward Compatibility
- All existing functions still work
- Caret workflows unchanged
- Smooth migration path
Backward Compatibility:
======================
All existing OmicSelector 1.0 functions remain fully functional:
- OmicSelector_benchmark()
- OmicSelector_OmicSelector()
- OmicSelector_heatmap()
- OmicSelector_PCA()
- And 30+ other functions
Future Phases:
=============
Phase 2: Advanced feature selection (stability metrics, knockoffs)
Phase 3: Multi-omics integration (DIABLO, MOFA+)
Phase 4: Clinical utility metrics (DCA, calibration)
Phase 5: Survival analysis (Cox, RSF, time-dependent)
Phase 6: Data connectors (GEO, TCGA, dataset cards)
Phase 7: Modern ML algorithms (AutoML, LightGBM, CatBoost)
Phase 8: Explainability (SHAP, LIME, ICE/PDP)
Breaking Changes:
================
None - this is a fully backward-compatible release.
Migration:
=========
Existing code works without changes. New features available via:
- OmicSelector_fit() for single model training
- OmicSelector_nested_cv() for rigorous validation
- OmicSelector_tripod_report() for compliance reporting
- OmicSelector_probast() for bias assessment
References:
==========
- TRIPOD+AI: Collins et al. (2024) BMJ
- PROBAST: Wolff et al. (2019) Ann Intern Med
- Nested CV: Varma & Simon (2006) BMC Bioinformatics
- Feature Stability: Nogueira & Brown (2016) ECML PKDD
Closes: Phase 1 of modernization plan
Related: OmicSelector 2.0 Roadmap
1 parent a2058d6, commit 7446bae
13 files changed
Lines changed: 3471 additions & 31 deletions