Hazards become disasters when buffering capacity collapses under accumulated pressure. This project models disaster escalation as a transition, not as a one-off event.
Instead of predicting “damage” directly, the engine answers a more actionable question:
Is this event’s system equilibrium becoming unstable, and which forces are driving it?
A diagnostic + early-warning engine that:
- computes an interpretable Instability Index (leading signal)
- decomposes each event into pressure forces vs buffer forces
- trains an ML model to estimate Major Disaster Escalation Risk (P(major))
- provides a Streamlit UI for:
  - single-event diagnosis
  - counterfactual scenario testing
  - cohort / map pressure-field visualization
- Not a “Kaggle damage prediction model”
- Not a black-box catastrophe forecaster
- Not an automated decision maker
This is a human-centered decision-support system.
Kaggle: Disaster Events 2025 by emirhanakku https://www.kaggle.com/datasets/emirhanakku/disaster-events-2025
Note: The dataset is synthetic/constructed for 2025-style disaster analysis. This project focuses on mechanistic interpretation and counterfactual stress testing, not historical truth claims.
A “disaster” is not the earthquake, flood, or wildfire by itself.
A disaster is the moment when:
- pressure increases
- buffers fail
- systems cannot recover
- outcomes (loss, displacement, casualties) materialize
So we model disaster escalation as a balance of forces:
- Hazard pressure (severity / intensity)
- Exposure pressure (affected population, density)
- Response latency pressure (slower response amplifies damage)
- Infrastructure fragility pressure (damage index indicates weak structure)
- Buffering capacity (aid/response capacity and the system’s ability to absorb shocks)
This creates an instability lens:
Instability rises before outcomes fully appear. That’s why it’s a leading signal.
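A minimal sketch of how such a force balance can be scored (the column names and weights below are illustrative assumptions, not the repo’s actual schema):

```python
import pandas as pd

# Illustrative force balance: pressures push instability up, buffers pull it down.
# All column names and weights here are assumptions for the sketch.
PRESSURES = {"hazard_severity": 0.30, "exposure": 0.25,
             "response_latency": 0.25, "infra_fragility": 0.20}
BUFFERS = {"aid_capacity": 0.6, "absorption_capacity": 0.4}

def instability_index(df: pd.DataFrame) -> pd.Series:
    """Weighted pressures minus weighted buffers, on min-max normalized features."""
    norm = lambda s: (s - s.min()) / (s.max() - s.min() + 1e-9)
    pressure = sum(w * norm(df[c]) for c, w in PRESSURES.items())
    buffer_ = sum(w * norm(df[c]) for c, w in BUFFERS.items())
    return pressure - buffer_
```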
For every event, the system computes:
- A continuous Instability Index: higher = more fragile equilibrium.
- Quantile-based early-warning zones (dataset-relative):
  - 🟢 Stable
  - 🟡 Fragile
  - 🟠 Unstable
  - 🔴 Critical
- A force bar decomposition:
  - negative bars → pressures pulling the system toward collapse
  - positive bars → buffering forces resisting collapse
- A trained classifier estimating P(major):
  - probability of major escalation
  - used as a secondary signal to confirm or contest the force-based reading
Important: ML is not replacing the force model. ML is an additional layer that learns nonlinear interactions.
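As a sketch, the dataset-relative zones can be cut at quantiles of the index; the exact quantile edges used by the app are an assumption here:

```python
import pandas as pd

def assign_zone(index: pd.Series) -> pd.Series:
    # Quantile edges are illustrative; the app's actual cut points may differ.
    return pd.qcut(index, q=[0, 0.5, 0.75, 0.9, 1.0],
                   labels=["🟢 Stable", "🟡 Fragile", "🟠 Unstable", "🔴 Critical"])
```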
- Load raw CSV
- Clean types + normalize key features
- Compute engineered “force signals”
- Save processed dataset (cache)
- Train ML model on the “major disaster” target proxy
- App loads the processed data and trained model, then produces diagnostics and simulations
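A condensed sketch of the prepare step (paths and column names are placeholders; it reuses the illustrative `instability_index` from above, not the actual `src` modules):

```python
import pandas as pd

def prepare_data(raw_csv: str = "data/raw/disasters.csv",
                 out_path: str = "data/processed/events.parquet") -> pd.DataFrame:
    df = pd.read_csv(raw_csv)
    df["date"] = pd.to_datetime(df["date"], errors="coerce")  # clean types
    df["instability_index"] = instability_index(df)           # engineered force signal
    df.to_parquet(out_path)                                   # cache processed dataset
    return df
```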
Why combine a force model with an ML model?
- force model = explainable mechanism
- ML model = pattern learner
- together = interpretable + adaptive
Run:

```bash
streamlit run app/app.py
```

Tabs:
- Event Diagnostic
- Scenario Simulator
- Map / Cohort View
Each tab is a different “lens” on the same underlying model.
This view explains why a single event escalates by decomposing it into pressures and buffers.
You choose one event row from the dataset.
This keeps the system grounded in real records:
- event type
- country/region
- date
- zone
This number answers:
How fragile is the event’s equilibrium right now?
It is not “damage.” It is a leading stress indicator.
This converts the continuous instability into human-friendly interpretation:
- Stable: buffers dominate
- Fragile: stress rising, buffers still holding
- Unstable: competing pressures, recovery weak
- Critical: collapse likely under small additional shocks
This represents stabilizing strength. In the screenshot it is high (~0.980), which explains why the event can be “Stable” even though hazards exist.
This is shown as context only:
- it’s an outcome
- it’s lagging
- it’s not the decision signal
This indicates the system has loaded the trained model successfully and can compute P(major).
The probability that this event belongs to the “major escalation” regime.
You’ll notice the screenshot shows 1.000, which suggests the model is extremely confident in that data region. (If needed, the probability output can be calibrated or the class definition adjusted; see the sketch below.)
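One standard option is to wrap the classifier in scikit-learn’s CalibratedClassifierCV; this is a sketch with synthetic stand-in data, not the project’s actual training code:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data; substitute the project's feature matrix and
# its "major disaster" target proxy.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Isotonic calibration maps raw scores onto better-behaved probabilities,
# which softens saturated outputs like a flat 1.000.
clf = CalibratedClassifierCV(GradientBoostingClassifier(), method="isotonic", cv=5)
clf.fit(X, y)
p_major = clf.predict_proba(X)[:, 1]
```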
This is the heart of the framework.
- Each bar represents a force.
- Direction indicates whether it destabilizes or stabilizes.
- Magnitude shows leverage.
Interpretation rule:
If the negative pressures dominate and buffer is weak → instability rises.
This makes the diagnostic view explainable by design.
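A sketch of how those signed contributions can be assembled for one event, reusing the illustrative PRESSURES / BUFFERS weights from the force-balance sketch above (per-feature normalization omitted for brevity):

```python
import pandas as pd

def force_decomposition(row: pd.Series) -> pd.Series:
    # Pressures get a negative sign (destabilizing), buffers a positive one
    # (stabilizing); magnitudes come from the illustrative weights above.
    forces = {c: -w * row[c] for c, w in PRESSURES.items()}
    forces.update({c: +w * row[c] for c, w in BUFFERS.items()})
    return pd.Series(forces).sort_values()

# In the Streamlit view this can render directly:
# st.bar_chart(force_decomposition(event_row))
```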
This simulator tests what-if interventions on the same event: faster response, aid delivery, reduced exposure, etc.
Most systems predict outcomes after interventions.
This simulator evaluates interventions directly by asking:
Which lever reduces instability the most?
Selects which event you are stress-testing.
Simulates escalation in hazard intensity.
This tests how sensitive the system is to stronger shocks.
This is a critical lever. In disaster systems, response time often behaves like a nonlinear amplifier: small delays produce disproportionately large consequences.
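One common way to encode that nonlinearity is a convex penalty on delay, e.g. a power law with exponent > 1; the functional form and constants here are assumptions, not the repo’s actual feature:

```python
def latency_pressure(response_hours: float, baseline_hours: float = 6.0,
                     gamma: float = 2.0) -> float:
    # gamma > 1 makes the penalty convex: doubling the delay more than
    # doubles the pressure, i.e. "small delays → large consequences".
    return (response_hours / baseline_hours) ** gamma
```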
This is a discrete buffer toggle:
- Keep / Increase / Decrease (depending on app options)
This is where you test buffer collapse vs reinforcement.
A proxy for exposure magnitude.
A structural fragility adjustment.
Shows how intervention shifts equilibrium.
The key number for decision-making.
If Δ is negative, the intervention improves stability; if Δ is positive, the scenario makes the system more fragile.
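A minimal sketch of computing that Δ, reusing the illustrative `instability_index` from above (the intervention column names are placeholders):

```python
import pandas as pd

def scenario_delta(df: pd.DataFrame, event_id, **changes) -> float:
    """Apply counterfactual changes to one event and return its Δ instability.

    Normalization stays dataset-wide, so the modified row is scored against
    the same reference frame as the baseline.
    """
    base = instability_index(df).loc[event_id]
    scenario_df = df.copy()
    for col, value in changes.items():           # e.g. response_latency=2.0
        scenario_df.loc[event_id, col] = value
    return float(instability_index(scenario_df).loc[event_id] - base)

# Negative Δ means the intervention stabilizes the event:
# delta = scenario_delta(df, event_id=42, response_latency=2.0)
```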
This shows whether the event crosses a threshold into a worse regime.
This measures how ML “agrees” with the scenario change.
Even if instability shifts slightly, ML may remain saturated (e.g., 1.0). That’s not a bug; it means the model still places the event in the same learned regime.
This view treats all events as a pressure field over geography.
The diagnostic view explains one event.
This view explains the system shape:
- clusters
- hotspots
- fragility regimes
- geographic concentration
Each dot = one event’s geo location.
Color indicates the event’s early-warning zone.
This makes hotspots visible:
- concentration of Critical/Unstable
- stable regions with occasional spikes
Size is proportional to instability magnitude.
So the map encodes two signals:
- categorical (zone)
- continuous (instability)
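A sketch of how those two channels might be encoded with Plotly (`lat`, `lon`, `zone`, and `event_type` are assumed column names in the processed dataset):

```python
import plotly.express as px

# The size channel must be non-negative, so clip the signed index for plotting.
plot_df = df.assign(size_signal=df["instability_index"].clip(lower=0))

fig = px.scatter_geo(
    plot_df,
    lat="lat", lon="lon",        # assumed geo columns
    color="zone",                # categorical channel: early-warning zone
    size="size_signal",          # continuous channel: instability magnitude
    hover_name="event_type",
)
fig.show()  # inside the app: st.plotly_chart(fig)
```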
You can filter by:
- disaster type(s)
- warning zone(s)
- max points plotted (performance)
This is not just visual; it’s analytic: you can isolate, for example:
- only floods
- only critical
- only one region cluster
Prepare the processed dataset:

```bash
python -m src.cli prepare-data
```

Train the ML model:

```bash
python -m src.cli train
```

Most disaster analytics produce outputs like:
- top affected regions
- predicted losses
- ranked countries
This system produces:
- instability fields
- force decomposition
- counterfactual intervention leverage
- interpretable regime shifts
It’s action-first, not leaderboard-first.
This tool should be used for:
- research and education
- prototyping decision-support concepts
- studying instability mechanics
It should not be used for:
- automated emergency response decisions
- high-stakes policy without validation
- real-world forecasting claims (dataset is synthetic)