supzammy/EpiRNA

## EpiRNA

EpiRNA-Scanner - old (don't use it)

Rapid, Length-Agnostic Mapping of m6A Motifs at Single-Nucleotide Resolution

DOI · Hugging Face Spaces · Python 3.8+ · License: MIT

EpiRNA-Scanner is an in-silico triage engine designed to rapidly map N6-methyladenosine (m6A) regulatory sites across entire transcriptomes. By leveraging sequence-derived biophysical embeddings and a dilated 1D Convolutional Neural Network (CNN) backbone, it bypasses the strict length limitations of traditional fixed-window models.

The framework features a Global-Local Hybrid Variance Stabilizer to prevent division-by-zero inflation in low-entropy homopolymer regions, and Epitranscriptomic Boundary Contrast Scoring (EBCS) to identify catalytic target coordinates with single-nucleotide spatial precision.

Live Demo

You can run EpiRNA-Scanner directly in your browser with zero local installation or disk storage required via our deployed web interface: 👉 EpiRNA-Scanner on Hugging Face


Note on Model Weights: The pre-trained model weights (~300 MB) exceed GitHub's file size limits. Before running the engine locally, please download the epirna_model.h5 file from our Zenodo Repository and place it in the root directory.

✨ Key Features

  • Length-Agnostic Architecture: Process raw FASTA sequences ranging from 41 bases to more than 100,000 bases without manual tiling or tensor shape mismatch errors (100 kb+ inputs benchmarked at ~110 seconds).
  • Single-Nucleotide Resolution: Reports the exact sequence coordinate of the target adenine within canonical DRACH motifs.
  • Homopolymer Stability: Maintains a stable 0.0 baseline contrast in dense Poly-A/GC regions, eliminating false positives caused by local variance collapse.
  • Zero-Disk NCBI Streaming: Fetch and score transcripts directly from NCBI Entrez accession numbers without downloading genome files to your local drive.
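
The zero-disk idea in the last bullet can be sketched with the Python standard library alone. This is a minimal illustration, not the engine's own code: it assumes the public NCBI E-utilities `efetch` endpoint, and the helper names `build_efetch_url`, `parse_fasta`, and `fetch_transcript` are hypothetical. Converting T to U is also an assumption, made because the README's example sequence is written in RNA letters.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for fetching records by accession.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession):
    """Build a request URL for one nucleotide record in FASTA format."""
    params = {"db": "nuccore", "id": accession,
              "rettype": "fasta", "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop at the second record
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    # Assumption: downstream scoring expects RNA letters (U instead of T).
    return header, "".join(chunks).upper().replace("T", "U")


def fetch_transcript(accession, timeout=30):
    """Download one record into memory; nothing is written to disk."""
    with urlopen(build_efetch_url(accession), timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return parse_fasta(text)
```

Calling `fetch_transcript` with an accession number needs network access and returns the header and sequence as in-memory strings, which could then be passed to the engine's `--sequence` option.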

Installation

To run EpiRNA-Scanner locally for reproducibility testing:

  1. Clone the repository or extract the Zenodo archive:
    git clone https://github.com/supzammy/EpiRNA.git
    cd EpiRNA
    
    
  2. Create a virtual environment (recommended):
    python -m venv epirna_env
    source epirna_env/bin/activate  # On Windows use: epirna_env\Scripts\activate
    
    
  3. Install dependencies:
    pip install -r requirements.txt
    
    
Reproducibility & Testing

This repository includes the validation datasets used to benchmark the tool's biological gating and spatial resolution.

I. The U135C Mutation Stress-Test

Test the system's sensitivity to canonical sequence motif disruption by comparing a wild-type transcript against a point-mutated control:

 python epirna_engine.py --fasta test_data/MYC_wildtype.fasta
 python epirna_engine.py --fasta test_data/MYC_mutant_U135C.fasta
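
To run the same wild-type versus mutant comparison on another transcript, the mutant control can be generated programmatically. The sketch below is a minimal illustration under stated assumptions: it reads a label such as U135C as "replace the U at 1-based position 135 with C" (the README does not state the coordinate convention), and `apply_point_mutation` is a hypothetical helper, not part of the engine.

```python
import re


def apply_point_mutation(sequence, mutation):
    """Apply a substitution such as 'U135C' to an RNA sequence.

    Assumption: positions are 1-based and the first letter names the
    reference base that must be present at that position.
    """
    label = mutation.upper().replace("T", "U")
    match = re.fullmatch(r"([ACGU])(\d+)([ACGU])", label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)

    rna = sequence.upper().replace("T", "U")
    if not 1 <= pos <= len(rna):
        raise ValueError(f"position {pos} is outside the sequence")
    if rna[pos - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {pos}, found {rna[pos - 1]}")
    # Rebuild the string with the single substituted base.
    return rna[:pos - 1] + alt + rna[pos:]
```

Checking the reference base before substituting guards against off-by-one mistakes when the position was counted against a different transcript version.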

II. The Homopolymer Variance Test

Verify the Global-Local Hybrid Variance Stabilizer against a simulated sequence designed to trigger division-by-zero errors in standard algorithms:

 python epirna_engine.py --fasta test_data/TEST_01_HOMOPOLYMER_STABILITY.fasta
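
The failure mode this test targets can be illustrated with a small pure-Python sketch. The README does not give the stabilizer's formula, so the blend used here (local variance plus a weighted global variance plus a small epsilon) and the names `stabilized_contrast`, `window`, `alpha`, and `eps` are assumptions for illustration only. On a perfectly flat score track, a purely local normalizer would divide zero by zero; with a blended denominator that stays positive, the contrast is 0.0.

```python
from statistics import pvariance


def stabilized_contrast(signal, window=5, alpha=0.5, eps=1e-6):
    """Central-difference contrast divided by a blended local/global spread.

    Hypothetical illustration; not the EpiRNA-Scanner formula.
    """
    n = len(signal)
    if n < 3:
        return [0.0] * n
    global_var = pvariance(signal)
    half = window // 2
    out = [0.0] * n  # the two end positions keep a contrast of 0.0
    for i in range(1, n - 1):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        local_var = pvariance(signal[lo:hi])
        derivative = (signal[i + 1] - signal[i - 1]) / 2.0
        # eps keeps the denominator positive even when both variances are 0.
        denominator = (local_var + alpha * global_var + eps) ** 0.5
        out[i] = derivative / denominator
    return out


flat_track = [0.7] * 50                  # homopolymer-like, zero variance
step_track = [0.0] * 10 + [1.0] * 10     # one sharp boundary

flat_contrast = stabilized_contrast(flat_track)
step_contrast = stabilized_contrast(step_track)
```

In this sketch `flat_contrast` is zero everywhere, while `step_contrast` is non-zero only at the two positions flanking the boundary.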

III. Direct Sequence Input

Run the tool directly from the command line using a raw sequence string:

 python epirna_engine.py --sequence "UCCGGCUCCGCUUCGGCGGACUCCGGCUUCGGC"
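
As a sanity check before scoring, the candidate DRACH positions in an input can be listed with a plain regular expression. This sketch covers only the motif lookup (D = A/G/U, R = A/G, H = A/C/U in IUPAC notation); it does not reproduce the model's scoring, and `find_drach_sites` is a hypothetical helper name. Positions are reported 1-based, which is an assumption about the convention a reader may prefer.

```python
import re

# Lookahead so that overlapping motifs are all reported.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach_sites(sequence):
    """Return (1-based position of the central A, motif) for each DRACH hit."""
    rna = sequence.upper().replace("T", "U")
    # The A is the third base of the motif: 0-based start + 2, then +1.
    return [(m.start() + 3, m.group(1)) for m in DRACH.finditer(rna)]


sites = find_drach_sites("UCCGGCUCCGCUUCGGCGGACUCCGGCUUCGGC")
# sites == [(20, "GGACU")]
```

The example sequence contains a single adenine, at position 20, inside the motif GGACU.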

Architecture Pipeline (v1.0)

  • Biophysical Embedding: Input sequences are initialized into a numerical tensor based on hydrogen-bond potential, aromatic stacking energy, and solvent accessibility.

  • Dilated CNN Framework: Evaluates spatial homology for degenerate DRACH constraints.

  • EBCS Derivative Scoring: Spatial derivatives compute a normalized contrast boundary, anchoring the prediction to the catalytic adenine.

  • Variance Stabilization: Modulates local scoring denominators against the global transcript entropy to suppress background noise.
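
Why a dilated convolutional stage can accept inputs of any length is easiest to see in a toy version. The sketch below is a pure-Python, single-channel dilated convolution with zero padding; the kernel values and the dilation schedule are illustrative assumptions, and the real model is a trained network loaded from epirna_model.h5. Because the same small kernel slides along the sequence, the output always has one value per input position.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Same'-padded 1D dilated convolution (cross-correlation), one channel."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < n:  # positions outside the sequence count as zero
                total += weight * signal[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Positions seen by one output value after stacking dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


short_out = dilated_conv1d([1.0] * 41, [1.0, 1.0, 1.0], dilation=4)
long_out = dilated_conv1d([1.0] * 1000, [1.0, 1.0, 1.0], dilation=4)
# len(short_out) == 41 and len(long_out) == 1000
```

Stacking layers with growing dilation widens the context each position sees without adding weights: `receptive_field(3, [1, 2, 4, 8])` evaluates to 31.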

Note: Future iterations will integrate thermodynamic Minimum Free Energy (MFE) calculations via algorithms such as RNAfold to structurally penalize thermodynamically inaccessible motifs.

Citation

If you use EpiRNA-Scanner in your research, please cite our corresponding manuscript and this Zenodo repository:

Zaeem Ahmad Mansoori, et al. (2026). "EpiRNA-Scanner: Rapid, Length-Agnostic Mapping of m6A Motifs at Single-Nucleotide Resolution." [Target Journal/Conference Name]. DOI: 10.5281/zenodo.20615778

License

This project is licensed under the MIT License - see the MIT LICENSE file for details.

About

EpiRNA-Scanner is a publicly deployed web framework that processes sequences from 41 bases to over 100,000 bases without manual preprocessing. It introduces Epitranscriptomic Boundary Contrast Scoring (EBCS), which produces single-nucleotide-resolution predictions anchored to degenerate DRACH motifs.
