Commit 2a8fc61

Better structure in python folder
1 parent 7e6c108 commit 2a8fc61

26 files changed

Lines changed: 255 additions & 192 deletions

.github/workflows/generate-reports.yml

Lines changed: 2 additions & 2 deletions
@@ -26,10 +26,10 @@ jobs:
 
       - name: Install dependencies
         run: |
-          pip install -r python/requirements.txt
+          pip install -r python/reports-updater/requirements.txt
 
       - name: Run SNOMED Report Generator
-        working-directory: ./python
+        working-directory: ./python/reports-updater
         env:
           SNOMED_USER: ${{ secrets.SNOMED_USER }}
           SNOMED_PASSWORD: ${{ secrets.SNOMED_PASSWORD }}
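
The step above hands `SNOMED_USER` and `SNOMED_PASSWORD` to the report generator as environment variables. As a rough sketch of how a script could read them: the function name `read_mlds_credentials` is hypothetical, the error text is borrowed from the troubleshooting section of the old README, and the actual handling inside `run-reports.py` is not shown in this diff.

```python
import os


def read_mlds_credentials(env=None):
    """Return (user, password) from the environment, or fail loudly.

    `env` defaults to os.environ; passing a plain dict makes this testable.
    Hypothetical helper, not taken from the repository.
    """
    env = os.environ if env is None else env
    user = env.get("SNOMED_USER")
    password = env.get("SNOMED_PASSWORD")
    # Treat empty strings the same as missing values.
    if not user or not password:
        raise RuntimeError("SNOMED_USER and SNOMED_PASSWORD must be set")
    return user, password
```

Failing early like this keeps a misconfigured workflow run from getting as far as the download step.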

.gitignore

Lines changed: 4 additions & 4 deletions
@@ -42,7 +42,7 @@ testem.log
 Thumbs.db
 
 # Python environment and data
-python/.env
-python/data/
-python/output/
-python/__pycache__/
+python/reports-updater/.env
+python/reports-updater/data/
+python/reports-updater/output/
+python/**/__pycache__/
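
The change from `python/__pycache__/` to `python/**/__pycache__/` matters because, in gitignore syntax, `/**/` matches zero or more directory levels, so cache directories inside `reports-updater/` and any later tool folder are covered too. A minimal sketch of that matching rule follows; the regex is a hand-written approximation of this single pattern, not a general gitignore parser, and `is_ignored_pycache` is an illustrative name.

```python
import re

# Approximates the gitignore pattern python/**/__pycache__/ :
# "python/", then zero or more intermediate directories, then "__pycache__/".
NESTED_PYCACHE = re.compile(r"^python/(?:[^/]+/)*__pycache__/")


def is_ignored_pycache(path: str) -> bool:
    """Return True if a repo-relative path lies inside a __pycache__ dir under python/."""
    return NESTED_PYCACHE.match(path) is not None
```

For example, `is_ignored_pycache("python/reports-updater/__pycache__/x.pyc")` is true, while a `__pycache__` directory outside `python/` is not matched by this pattern.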

python/README.md

Lines changed: 40 additions & 185 deletions
@@ -1,208 +1,63 @@
-# SNOMED CT Report Generator
+# Python Scripts
 
-Python scripts to automatically generate analytical reports for SNOMED CT International Edition.
+This directory contains Python scripts and tools for the SCT Implementation Demonstrator project.
 
-## 🎯 Features
+## 📂 Directory Structure
 
-- **Automatic Download**: Fetches the latest SNOMED CT International version from the MLDS syndication feed
-  - 📡 The feed is public (no authentication required)
-  - 🔐 ZIP file download requires MLDS credentials
-- **Comprehensive Analysis**: Generates 3 types of reports:
-  - Concept inactivations with historical reasons
-  - FSN (Fully Specified Name) changes
-  - New concepts by semantic tag
-- **Interactive Visualizations**: HTML charts with Plotly
-- **Excel Outputs**: Spreadsheets for detailed analysis
-
-## 📋 Requirements
-
-- Python 3.11+
-- Account at [SNOMED International MLDS](https://mlds.ihtsdotools.org/)
-
-## 🚀 Installation
-
-1. Install dependencies:
-
-```bash
-cd python
-pip install -r requirements.txt
-```
-
-2. Configure MLDS credentials:
-
-Create a `.env` file in the `python/` directory:
-
-```bash
-SNOMED_USER=your_email@example.com
-SNOMED_PASSWORD=your_password
-```
-
-## 💻 Usage
-
-### Test connection (recommended first)
-
-```bash
-python3 test_connection.py
 ```
-
-This script verifies that your MLDS credentials work correctly without downloading the complete files.
-
-### Generate all reports
-
-```bash
-python3 run-reports.py
-```
-
-This script will automatically:
-
-1. ✅ Download the latest SNOMED CT International version
-2. ✅ Locate required RF2 files
-3. ✅ Generate 3 reports (Excel + HTML)
-4. ✅ Save to `../src/assets/reports/`
-
-### Run individual scripts
-
-```bash
-# Download latest version only
-python3 syndication_downloader.py
-
-# Detect inactivations only
-python3 detect_inactivations.py
-
-# Detect FSN changes only
-python3 fsn_changes.py
-
-# Detect new concepts only
-python3 new_concepts.py
+python/
+├── README.md  # This file
+└── reports-updater/  # SNOMED CT report generation automation
+    ├── README.md  # Full documentation
+    ├── run-reports.py  # Main script
+    ├── requirements.txt  # Dependencies
+    └── ...  # Supporting scripts
 ```
 
-## 📊 Generated Reports
-
-Reports are saved to two locations:
+## 🔧 Current Tools
 
-**Excel files** (`python/output/`):
-- `detect-inactivations.xlsx`
-- `fsn-changes.xlsx`
-- `list-new-concepts.xlsx`
+### Reports Updater
 
-**HTML files** (`src/assets/reports/`):
-- `detect_inactivations_by_reason.html`
-- `fsn_changes_with_details.html`
-- `new_concepts_by_semantic_tag.html`
+Automated system for generating SNOMED CT analytical reports:
+- Downloads latest SNOMED CT International Edition
+- Generates Excel and HTML reports
+- Integrated with GitHub Actions for monthly updates
 
-### Report Contents
+**Location**: `reports-updater/`
+**Documentation**: See [reports-updater/README.md](reports-updater/README.md)
 
-1. **Inactivations**
-   - Concepts inactivated by reason (Duplicate, Outdated, etc.)
-   - Historical associations (SAME_AS, REPLACED_BY, etc.)
+## 🚀 Adding New Tools
 
-2. **FSN Changes**
-   - Changes in Fully Specified Names
-   - Analysis by semantic tag
-
-3. **New Concepts**
-   - New concepts added
-   - Distribution by semantic tag
-
-## 🤖 GitHub Actions Automation
-
-The workflow `.github/workflows/generate-reports.yml` runs:
-
-- 📅 **Automatically**: Day 2 of each month at 3 AM UTC
-- 🔄 **Manually**: Click "Run workflow" in GitHub Actions tab (anytime!)
-
-### Quick Setup:
-
-The workflow uses the GitHub Environment **`reports-updates`** with two secrets:
-- `SNOMED_USER`: Your MLDS email
-- `SNOMED_PASSWORD`: Your MLDS password
-
-**Setup**:
-1. Go to **Settings** → **Environments** → **reports-updates**
-2. Verify the secrets are configured (you already did this!)
-
-### Manual Execution:
-
-1. Go to **Actions** tab in GitHub
-2. Select **"Generate SNOMED Reports"**
-3. Click **"Run workflow"** button
-4. Click the green **"Run workflow"** to confirm
-
-📖 **Detailed guide**: See [GITHUB_ACTIONS.md](GITHUB_ACTIONS.md) for step-by-step instructions with troubleshooting
-
-## 🏗️ Architecture
+This structure allows for adding more Python scripts in the future:
 
 ```
 python/
-├── run-reports.py  # Main orchestrator script
-├── syndication_downloader.py  # Download from MLDS feed
-├── download_and_extract.py  # Download and extraction with progress
-├── file_locator.py  # Locates RF2 files
-├── detect_inactivations.py  # Inactivation analysis
-├── detect_inactivations_graph_details.py
-├── fsn_changes.py  # FSN changes analysis
-├── fsn_changes_graph_details.py
-├── new_concepts.py  # New concepts analysis
-├── new_concepts_graph_details.py
-├── requirements.txt  # Python dependencies
-├── README.md  # This file
-├── SETUP.md  # Setup guide
-├── GITHUB_ACTIONS.md  # GitHub Actions guide (NEW!)
-├── CHANGELOG.md  # Change history
-├── STATUS.md  # Project status
-└── QUICKSTART.txt  # Quick start guide
-```
-
-## 🔧 Advanced Configuration
-
-### Package Filters
-
-The script automatically filters by acceptable package types (similar to Snowstorm Lite Java client):
-
-```python
-ACCEPTABLE_PACKAGE_TYPES = {
-    "SCT_RF2_SNAPSHOT",
-    "SCT_RF2_FULL",
-    "SCT_RF2_ALL"
-}
-```
-
-### Graph Limits
-
-By default, HTML charts show the top 1500 entries. Adjust in `run-reports.py`:
-
-```python
-generate_inactivation_report(xlsx, html, limit=1500)
+├── reports-updater/  # Report generation
+├── terminology-validator/  # Future: Validation tools
+├── concept-browser/  # Future: Browse SNOMED concepts
+└── mapping-tools/  # Future: Mapping utilities
 ```
 
-## 🐛 Troubleshooting
-
-### Error: "cannot import name 'download_latest_international'"
-
-Ensure `syndication_downloader.py` is complete and has no syntax errors.
-
-### Error: "SNOMED_USER and SNOMED_PASSWORD must be set"
-
-Create the `.env` file with the correct credentials.
-
-### Error: "No International Edition found"
-
-Verify:
-- Correct MLDS credentials
-- Internet access
-- Permissions on your MLDS account
+Each tool should:
+1. Have its own subdirectory
+2. Include a `README.md` with documentation
+3. Have its own `requirements.txt` if needed
+4. Be independent and self-contained
 
-## 📚 References
+## 📝 Conventions
 
-- [SNOMED International MLDS](https://mlds.ihtsdotools.org/)
-- [Syndication Feed API](https://mlds.ihtsdotools.org/api/feed)
-- [SNOMED CT RF2 Specification](https://confluence.ihtsdotools.org/display/DOCRELFMT/SNOMED+CT+Release+File+Specifications)
+- **Python version**: 3.11+
+- **Code style**: Follow PEP 8
+- **Documentation**: Include README.md in each subdirectory
+- **Dependencies**: Use `requirements.txt` per tool
+- **Environment variables**: Use `.env` files (Git-ignored)
 
-## 📝 License
+## 🔗 Links
 
-See LICENSE.md in the project root directory.
+- [Reports Updater Documentation](reports-updater/README.md)
+- [GitHub Actions Workflow](../.github/workflows/generate-reports.yml)
 
-## 👥 Contributions
+---
 
-Inspired by the [Snowstorm Lite Syndication Client](https://github.com/IHTSDO/snowstorm-lite) (Java).
+**Note**: Each subdirectory is self-contained with its own documentation, dependencies, and configuration.
 
File renamed without changes.
File renamed without changes.
File renamed without changes.
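
The new `python/README.md` asks every tool to live in its own subdirectory with a `README.md`. That rule could be checked mechanically; the following is only a sketch under that assumption, and `find_tools_missing_readme` is a hypothetical helper that does not exist in this commit. It deliberately ignores `requirements.txt`, which the README marks as "if needed".

```python
from pathlib import Path


def find_tools_missing_readme(python_root: Path) -> list[str]:
    """Return names of tool subdirectories under python_root that lack a README.md."""
    missing = []
    for entry in sorted(python_root.iterdir()):
        # Skip plain files and hidden or cache directories such as __pycache__.
        if not entry.is_dir() or entry.name.startswith((".", "__")):
            continue
        if not (entry / "README.md").is_file():
            missing.append(entry.name)
    return missing
```

Calling `find_tools_missing_readme(Path("python"))` from the repository root would list any tool folder that still needs documentation.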