Screens for pathogenic repeat expansions, a class of mutations invisible to both DeepVariant and Manta. In a repeat expansion, a short DNA motif (3-6 bases) is repeated far more times than normal.
STR expansions cause ~40 known neurological/neuromuscular diseases including Huntington's, Fragile X, Friedreich's ataxia, ALS/FTD, myotonic dystrophy, and multiple spinocerebellar ataxias.
- ExpansionHunter v5.0.0 (Illumina), upgraded from v2.5.5. Adds multithreading, improved long-repeat estimation, and a bundled GRCh38 variant catalog (31 pathogenic loci).
- Image: quay.io/biocontainers/expansionhunter:5.0.0--hc26b3af_5
- Binary: ExpansionHunter (on PATH)
- GRCh38 catalog: /usr/local/share/ExpansionHunter/variant_catalog/grch38/variant_catalog.json (31 pathogenic loci)
| Disease | Gene | Repeat Unit | Normal | Pathogenic |
|---|---|---|---|---|
| Huntington's | HTT | CAG | <27 | >35 |
| Fragile X | FMR1 | CGG | <45 | ≥55 (premutation) / >200 (full) |
| Friedreich's Ataxia | FXN | GAA | <33 | >66 |
| ALS/FTD | C9ORF72 | GGCCCC | <24 | >30 |
| Myotonic Dystrophy 1 | DMPK | CTG | <35 | >50 |
| SCA1 | ATXN1 | CAG | <33 | >39 |
| SCA2 | ATXN2 | CAG | <22 | >33 |
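
The thresholds in the table above can be applied mechanically to a repeat count. The following is a minimal sketch, not part of ExpansionHunter or of this pipeline: `classify_repeat` is a hypothetical helper, and the label "intermediate" for counts that fall between the normal and pathogenic cutoffs is an assumption, since the table does not name that range. FMR1 is left out because its pathogenic column has two tiers.

```shell
# Hypothetical helper: classify a repeat count using the table's cutoffs.
# Usage: classify_repeat <GENE> <REPEAT_COUNT>
classify_repeat() {
    gene="$1"
    count="$2"

    # Accept only non-negative integers as repeat counts.
    case "$count" in
        ''|*[!0-9]*) echo "invalid count: $count" >&2; return 2 ;;
    esac

    # Cutoffs copied from the table: normal is < normal_below,
    # pathogenic is > pathogenic_above.
    case "$gene" in
        HTT)     normal_below=27; pathogenic_above=35 ;;
        FXN)     normal_below=33; pathogenic_above=66 ;;
        C9ORF72) normal_below=24; pathogenic_above=30 ;;
        DMPK)    normal_below=35; pathogenic_above=50 ;;
        ATXN1)   normal_below=33; pathogenic_above=39 ;;
        ATXN2)   normal_below=22; pathogenic_above=33 ;;
        *) echo "unknown gene: $gene" >&2; return 2 ;;
    esac

    if [ "$count" -lt "$normal_below" ]; then
        echo "normal"
    elif [ "$count" -gt "$pathogenic_above" ]; then
        echo "pathogenic"
    else
        # Assumed label for the unnamed range between the two cutoffs.
        echo "intermediate"
    fi
}
```

For example, `classify_repeat HTT 42` prints `pathogenic` and `classify_repeat HTT 19` prints `normal`. A result like this is a screening flag only and does not replace clinical confirmation.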
FMR1 (Fragile X) has four distinct clinical zones — the intermediate zone (45-54 repeats) is often omitted but clinically relevant:
| Zone | Repeats | Clinical Significance |
|---|---|---|
| Normal | <45 | No risk |
| Intermediate (gray zone) | 45-54 | Not affected, but repeats may expand in offspring. Genetic counseling recommended for carriers. |
| Premutation | 55-200 | Risk of FXTAS (tremor/ataxia, males >50), FXPOI (premature ovarian insufficiency). Offspring at risk of full expansion. |
| Full mutation | >200 | Fragile X syndrome (intellectual disability, behavioral features). Penetrance varies by sex and methylation. |
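
The four FMR1 zones map directly onto integer ranges, so a reported CGG repeat count can be bucketed with a few comparisons. This is a minimal sketch under the boundaries given in the table above; `fmr1_zone` is a hypothetical helper name, not something shipped with ExpansionHunter.

```shell
# Hypothetical helper: map an FMR1 CGG repeat count to its clinical zone.
# Usage: fmr1_zone <REPEAT_COUNT>
fmr1_zone() {
    # Accept only non-negative integers.
    case "$1" in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;
    esac

    if [ "$1" -lt 45 ]; then
        echo "normal"
    elif [ "$1" -le 54 ]; then
        echo "intermediate"
    elif [ "$1" -le 200 ]; then
        echo "premutation"
    else
        echo "full mutation"
    fi
}
```

For a female sample, each of the two alleles would be classified separately, for example `fmr1_zone 30` and `fmr1_zone 62`.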

    ./scripts/09-expansion-hunter.sh your_name male
    # or: ./scripts/09-expansion-hunter.sh your_name female

The second argument (male/female) is required. It affects X-linked loci (FMR1, AR): males have one allele, females have two.
- Uses ExpansionHunter v5.0.0 (quay.io/biocontainers/expansionhunter:5.0.0--hc26b3af_5)
- v5 CLI: --reads, --reference, --variant-catalog (JSON file), --output-prefix (auto-generates .vcf, .json)
- The 31-locus GRCh38 variant catalog is bundled inside the container at /usr/local/share/ExpansionHunter/variant_catalog/grch38/variant_catalog.json
- Short-read WGS can reliably detect expansions up to ~150 repeats; very large expansions (>1000) are less accurate
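
To show how the v5 options listed above fit together, here is a rough sketch of a containerized call. It is not the contents of scripts/09-expansion-hunter.sh. The function name `run_expansion_hunter`, the `/data` mount, the `<sample>.bam` and `GRCh38.fa` file names, and the `DRY_RUN` switch are all illustrative assumptions. `--sex` is an ExpansionHunter option not listed above; that the script passes the male/female argument through it is also an assumption.

```shell
# Image and bundled catalog path as given in this document.
IMAGE="quay.io/biocontainers/expansionhunter:5.0.0--hc26b3af_5"
CATALOG="/usr/local/share/ExpansionHunter/variant_catalog/grch38/variant_catalog.json"

# Hypothetical wrapper. Usage: run_expansion_hunter <sample> <male|female>
# Set DRY_RUN=1 to print the command instead of running it.
run_expansion_hunter() {
    sample="$1"
    sex="$2"

    # The sex argument is mandatory because it changes the expected
    # allele count at X-linked loci.
    case "$sex" in
        male|female) ;;
        *) echo "usage: run_expansion_hunter <sample> <male|female>" >&2
           return 2 ;;
    esac

    # Assumed layout: the current directory holds <sample>.bam (indexed)
    # and GRCh38.fa, and is mounted into the container at /data.
    ${DRY_RUN:+echo} docker run --rm -v "$PWD:/data" "$IMAGE" \
        ExpansionHunter \
        --reads "/data/${sample}.bam" \
        --reference /data/GRCh38.fa \
        --variant-catalog "$CATALOG" \
        --sex "$sex" \
        --output-prefix "/data/${sample}.expansionhunter"
}
```

Running `DRY_RUN=1 run_expansion_hunter your_name male` prints the assembled command, which is a convenient way to check paths before starting a real run.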