|
1 | | -# SNOMED CT Report Generator |
| 1 | +# Python Scripts |
2 | 2 |
|
3 | | -Python scripts to automatically generate analytical reports for SNOMED CT International Edition. |
| 3 | +This directory contains Python scripts and tools for the SCT Implementation Demonstrator project. |
4 | 4 |
|
5 | | -## Features |
| 5 | +## Directory Structure |
6 | 6 |
|
7 | | -- **Automatic Download**: Fetches the latest SNOMED CT International version from the MLDS syndication feed |
8 | | - - The feed is public (no authentication required) |
9 | | - - ZIP file download requires MLDS credentials |
10 | | -- **Comprehensive Analysis**: Generates 3 types of reports: |
11 | | - - Concept inactivations with historical reasons |
12 | | - - FSN (Fully Specified Name) changes |
13 | | - - New concepts by semantic tag |
14 | | -- **Interactive Visualizations**: HTML charts with Plotly |
15 | | -- **Excel Outputs**: Spreadsheets for detailed analysis |
16 | | - |
17 | | -## Requirements |
18 | | - |
19 | | -- Python 3.11+ |
20 | | -- Account at [SNOMED International MLDS](https://mlds.ihtsdotools.org/) |
21 | | - |
22 | | -## Installation |
23 | | - |
24 | | -1. Install dependencies: |
25 | | - |
26 | | -```bash |
27 | | -cd python |
28 | | -pip install -r requirements.txt |
29 | | -``` |
30 | | - |
31 | | -2. Configure MLDS credentials: |
32 | | - |
33 | | -Create a `.env` file in the `python/` directory: |
34 | | - |
35 | | -```bash |
36 | | -SNOMED_USER=your_email@example.com |
37 | | -SNOMED_PASSWORD=your_password |
38 | | -``` |
39 | | - |
40 | | -## Usage |
41 | | - |
42 | | -### Test connection (recommended first) |
43 | | - |
44 | | -```bash |
45 | | -python3 test_connection.py |
46 | 7 | ``` |
47 | | - |
48 | | -This script verifies that your MLDS credentials work correctly without downloading the complete files. |
49 | | - |
50 | | -### Generate all reports |
51 | | - |
52 | | -```bash |
53 | | -python3 run-reports.py |
54 | | -``` |
55 | | - |
56 | | -This script will automatically: |
57 | | - |
58 | | -1. β
Download the latest SNOMED CT International version |
59 | | -2. β
Locate required RF2 files |
60 | | -3. β
Generate 3 reports (Excel + HTML) |
61 | | -4. β
Save to `../src/assets/reports/` |
62 | | - |
63 | | -### Run individual scripts |
64 | | - |
65 | | -```bash |
66 | | -# Download latest version only |
67 | | -python3 syndication_downloader.py |
68 | | - |
69 | | -# Detect inactivations only |
70 | | -python3 detect_inactivations.py |
71 | | - |
72 | | -# Detect FSN changes only |
73 | | -python3 fsn_changes.py |
74 | | - |
75 | | -# Detect new concepts only |
76 | | -python3 new_concepts.py |
| 8 | +python/ |
| 9 | +├── README.md # This file |
| 10 | +└── reports-updater/ # SNOMED CT report generation automation |
| 11 | +    ├── README.md # Full documentation |
| 12 | +    ├── run-reports.py # Main script |
| 13 | +    ├── requirements.txt # Dependencies |
| 14 | +    └── ... # Supporting scripts |
77 | 15 | ``` |
78 | 16 |
|
79 | | -## Generated Reports |
80 | | - |
81 | | -Reports are saved to two locations: |
| 17 | +## Current Tools |
82 | 18 |
|
83 | | -**Excel files** (`python/output/`): |
84 | | -- `detect-inactivations.xlsx` |
85 | | -- `fsn-changes.xlsx` |
86 | | -- `list-new-concepts.xlsx` |
| 19 | +### Reports Updater |
87 | 20 |
|
88 | | -**HTML files** (`src/assets/reports/`): |
89 | | -- `detect_inactivations_by_reason.html` |
90 | | -- `fsn_changes_with_details.html` |
91 | | -- `new_concepts_by_semantic_tag.html` |
| 21 | +Automated system for generating SNOMED CT analytical reports: |
| 22 | +- Downloads latest SNOMED CT International Edition |
| 23 | +- Generates Excel and HTML reports |
| 24 | +- Integrated with GitHub Actions for monthly updates |
92 | 25 |
|
93 | | -### Report Contents |
| 26 | +**Location**: `reports-updater/` |
| 27 | +**Documentation**: See [reports-updater/README.md](reports-updater/README.md) |
94 | 28 |
|
95 | | -1. **Inactivations** |
96 | | - - Concepts inactivated by reason (Duplicate, Outdated, etc.) |
97 | | - - Historical associations (SAME_AS, REPLACED_BY, etc.) |
| 29 | +## Adding New Tools |
98 | 30 |
|
99 | | -2. **FSN Changes** |
100 | | - - Changes in Fully Specified Names |
101 | | - - Analysis by semantic tag |
102 | | - |
103 | | -3. **New Concepts** |
104 | | - - New concepts added |
105 | | - - Distribution by semantic tag |
106 | | - |
107 | | -## GitHub Actions Automation |
108 | | - |
109 | | -The workflow `.github/workflows/generate-reports.yml` runs: |
110 | | - |
111 | | -- **Automatically**: Day 2 of each month at 3 AM UTC |
112 | | -- **Manually**: Click "Run workflow" in GitHub Actions tab (anytime!) |
113 | | - |
114 | | -### Quick Setup: |
115 | | - |
116 | | -The workflow uses the GitHub Environment **`reports-updates`** with two secrets: |
117 | | -- `SNOMED_USER`: Your MLDS email |
118 | | -- `SNOMED_PASSWORD`: Your MLDS password |
119 | | - |
120 | | -**Setup**: |
121 | | -1. Go to **Settings** → **Environments** → **reports-updates** |
122 | | -2. Verify the secrets are configured (you already did this!) |
123 | | - |
124 | | -### Manual Execution: |
125 | | - |
126 | | -1. Go to **Actions** tab in GitHub |
127 | | -2. Select **"Generate SNOMED Reports"** |
128 | | -3. Click **"Run workflow"** button |
129 | | -4. Click the green **"Run workflow"** to confirm |
130 | | - |
131 | | -**Detailed guide**: See [GITHUB_ACTIONS.md](GITHUB_ACTIONS.md) for step-by-step instructions with troubleshooting |
132 | | - |
133 | | -## Architecture |
| 31 | +This structure allows for adding more Python scripts in the future: |
134 | 32 |
|
135 | 33 | ``` |
136 | 34 | python/ |
137 | | -├── run-reports.py # Main orchestrator script |
138 | | -├── syndication_downloader.py # Download from MLDS feed |
139 | | -├── download_and_extract.py # Download and extraction with progress |
140 | | -├── file_locator.py # Locates RF2 files |
141 | | -├── detect_inactivations.py # Inactivation analysis |
142 | | -├── detect_inactivations_graph_details.py |
143 | | -├── fsn_changes.py # FSN changes analysis |
144 | | -├── fsn_changes_graph_details.py |
145 | | -├── new_concepts.py # New concepts analysis |
146 | | -├── new_concepts_graph_details.py |
147 | | -├── requirements.txt # Python dependencies |
148 | | -├── README.md # This file |
149 | | -├── SETUP.md # Setup guide |
150 | | -├── GITHUB_ACTIONS.md # GitHub Actions guide (NEW!) |
151 | | -├── CHANGELOG.md # Change history |
152 | | -├── STATUS.md # Project status |
153 | | -└── QUICKSTART.txt # Quick start guide |
154 | | -``` |
155 | | - |
156 | | -## Advanced Configuration |
157 | | - |
158 | | -### Package Filters |
159 | | - |
160 | | -The script automatically filters by acceptable package types (similar to Snowstorm Lite Java client): |
161 | | - |
162 | | -```python |
163 | | -ACCEPTABLE_PACKAGE_TYPES = { |
164 | | - "SCT_RF2_SNAPSHOT", |
165 | | - "SCT_RF2_FULL", |
166 | | - "SCT_RF2_ALL" |
167 | | -} |
168 | | -``` |
169 | | - |
170 | | -### Graph Limits |
171 | | - |
172 | | -By default, HTML charts show the top 1500 entries. Adjust in `run-reports.py`: |
173 | | - |
174 | | -```python |
175 | | -generate_inactivation_report(xlsx, html, limit=1500) |
| 35 | +├── reports-updater/ # Report generation |
| 36 | +├── terminology-validator/ # Future: Validation tools |
| 37 | +├── concept-browser/ # Future: Browse SNOMED concepts |
| 38 | +└── mapping-tools/ # Future: Mapping utilities |
176 | 39 | ``` |
177 | 40 |
|
178 | | -## Troubleshooting |
179 | | - |
180 | | -### Error: "cannot import name 'download_latest_international'" |
181 | | - |
182 | | -Ensure `syndication_downloader.py` is complete and has no syntax errors. |
183 | | - |
184 | | -### Error: "SNOMED_USER and SNOMED_PASSWORD must be set" |
185 | | - |
186 | | -Create the `.env` file with the correct credentials. |
187 | | - |
188 | | -### Error: "No International Edition found" |
189 | | - |
190 | | -Verify: |
191 | | -- Correct MLDS credentials |
192 | | -- Internet access |
193 | | -- Permissions on your MLDS account |
| 41 | +Each tool should: |
| 42 | +1. Have its own subdirectory |
| 43 | +2. Include a `README.md` with documentation |
| 44 | +3. Have its own `requirements.txt` if needed |
| 45 | +4. Be independent and self-contained |
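
The layout rules above lend themselves to an automated check. The following is a minimal sketch, not part of this change: the function name `check_tool_layout` is hypothetical, and it only verifies the one unconditional rule (a `README.md` in every tool subdirectory), since `requirements.txt` is required only "if needed".

```python
from pathlib import Path


def check_tool_layout(python_dir: Path) -> dict[str, list[str]]:
    """Map each tool subdirectory to the required files it is missing.

    Hypothetical helper: only README.md is treated as mandatory, because
    the conventions make requirements.txt optional.
    """
    problems: dict[str, list[str]] = {}
    for tool_dir in sorted(p for p in python_dir.iterdir() if p.is_dir()):
        # Skip hidden directories such as .venv or .git.
        if tool_dir.name.startswith("."):
            continue
        missing = []
        if not (tool_dir / "README.md").is_file():
            missing.append("README.md")
        if missing:
            problems[tool_dir.name] = missing
    return problems
```

Calling `check_tool_layout(Path("python"))` from the repository root would return an empty dict when every tool directory follows the convention, which makes it easy to wire into a CI step or a pre-commit hook.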
194 | 46 |
|
195 | | -## References |
| 47 | +## Conventions |
196 | 48 |
|
197 | | -- [SNOMED International MLDS](https://mlds.ihtsdotools.org/) |
198 | | -- [Syndication Feed API](https://mlds.ihtsdotools.org/api/feed) |
199 | | -- [SNOMED CT RF2 Specification](https://confluence.ihtsdotools.org/display/DOCRELFMT/SNOMED+CT+Release+File+Specifications) |
| 49 | +- **Python version**: 3.11+ |
| 50 | +- **Code style**: Follow PEP 8 |
| 51 | +- **Documentation**: Include README.md in each subdirectory |
| 52 | +- **Dependencies**: Use `requirements.txt` per tool |
| 53 | +- **Environment variables**: Use `.env` files (Git-ignored) |
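
To illustrate the `.env` convention, here is a minimal standard-library sketch of how a tool might read its settings. The helper names `parse_env_text` and `require_credentials` are assumptions for illustration; the actual scripts may use a library such as `python-dotenv` instead. The variable names `SNOMED_USER` and `SNOMED_PASSWORD` and the error message are taken from the previous README.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def require_credentials(values: dict[str, str]) -> tuple[str, str]:
    """Return (user, password) or fail early with a clear message."""
    user = values.get("SNOMED_USER")
    password = values.get("SNOMED_PASSWORD")
    if not user or not password:
        raise RuntimeError("SNOMED_USER and SNOMED_PASSWORD must be set")
    return user, password
```

A tool could then call `require_credentials(parse_env_text(Path(".env").read_text()))` (with `Path` imported from `pathlib`) at startup, so a missing credential is reported before any download begins.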
200 | 54 |
|
201 | | -## License |
| 55 | +## Links |
202 | 56 |
|
203 | | -See LICENSE.md in the project root directory. |
| 57 | +- [Reports Updater Documentation](reports-updater/README.md) |
| 58 | +- [GitHub Actions Workflow](../.github/workflows/generate-reports.yml) |
204 | 59 |
|
205 | | -## Contributions |
| 60 | +--- |
206 | 61 |
|
207 | | -Inspired by the [Snowstorm Lite Syndication Client](https://github.com/IHTSDO/snowstorm-lite) (Java). |
| 62 | +**Note**: Each subdirectory is self-contained with its own documentation, dependencies, and configuration. |
208 | 63 |
|