Movement Analysis CLI

A command-line pipeline for movement ecology analysis of GPS telemetry. It takes a CSV of animal fixes and a YAML config file and produces a full suite of spatial, behavioural, and seasonal outputs (figures, tables, and GIS-ready layers) with no code editing required.

Tested on African lions, mountain caribou, and African elephants.

Quickstart

pip install -r requirements.txt
python run.py config.yaml

To validate your config and data without running the full analysis:

python run.py config.yaml --dry-run

Installation

Requires Python 3.10 or later.

pip install -r requirements.txt

# Optional — more robust clustering for multi-species datasets
pip install hdbscan

Input data

A CSV file with at minimum:

| Column | Description |
| --- | --- |
| timestamp | Fix datetime (any parseable format, UTC or naive) |
| individual ID | Animal identifier |
| longitude | Decimal degrees |
| latitude | Decimal degrees |

Column names are mapped in config.yaml — the file is never modified.
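
As a rough illustration of that mapping, the sketch below checks that the headers named in a config actually exist in a CSV. The helper `missing_columns` and the dict-based config are assumptions made for this example, not the pipeline's own code; the config keys follow the Required section below.

```python
import csv
import io

# Config keys that map pipeline roles to CSV headers (key names from the Required section).
COLUMN_KEYS = ("timestamp_col", "id_col", "lon_col", "lat_col")


def missing_columns(csv_file, config):
    """Return the mapped column names that are absent from the CSV header.

    Hypothetical helper: matching is exact and case-sensitive.
    """
    header = next(csv.reader(csv_file), [])
    return [config[key] for key in COLUMN_KEYS if config[key] not in header]


config = {
    "timestamp_col": "timestamp",
    "id_col": "individual-local-identifier",
    "lon_col": "location-long",
    "lat_col": "location-lat",
}
sample = io.StringIO("timestamp,individual-local-identifier,location-long,Location-Lat\n")
print(missing_columns(sample, config))  # ['location-lat'] because the header case differs
```

Only the header row is read, so the data file itself is left untouched.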

Configuration

All parameters live in config.yaml. The key sections:

Required

data_path:     data/lion.csv
timestamp_col: timestamp
id_col:        individual-local-identifier
lon_col:       location-long
lat_col:       location-lat

Coordinate reference system

input_crs:  EPSG:4326    # CRS of your raw GPS data (almost always WGS84)
metric_crs: EPSG:32734   # Projected CRS for distances and areas
                         # Find your UTM zone at epsg.io
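
If you would rather compute the UTM code than look it up, the standard zone arithmetic is short. This is a minimal sketch, not part of the pipeline: `utm_epsg` is a hypothetical helper, the coordinates are illustrative, and the special zones around Norway and Svalbard are ignored.

```python
import math


def utm_epsg(lon, lat):
    """Return the EPSG code of the WGS84 / UTM zone containing (lon, lat).

    Standard 6-degree zones only; the Norway and Svalbard exceptions are ignored.
    """
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    base = 32600 if lat >= 0 else 32700  # 326xx = northern, 327xx = southern hemisphere
    return base + zone


print(utm_epsg(23.5, -19.0))  # 32734, i.e. UTM zone 34S
```

Pick the zone that covers the centre of your study area; a dataset spanning several zones may need a different projected CRS.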

Temporal resampling

resample_interval_minutes: 60

Resamples each individual to one fix per interval before any analysis runs. This removes fix-rate bias when comparing datasets with different collar schedules. Set to 0 to disable.
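
One simple way to thin fixes to a regular interval is to keep the first fix in each time bin. The sketch below assumes that rule; the pipeline may select fixes differently (for example, nearest to the bin centre). The function name and tuple layout are hypothetical.

```python
from datetime import datetime, timezone


def resample_fixes(fixes, interval_minutes):
    """Keep the first fix per individual in each fixed-width time bin.

    `fixes` is an iterable of (individual_id, datetime, lon, lat) tuples with
    timezone-aware datetimes (naive ones would be read as local time by
    `timestamp()`). An interval of 0 disables resampling, as in the config.
    """
    fixes = sorted(fixes, key=lambda f: (f[0], f[1]))
    if interval_minutes <= 0:
        return fixes
    width = interval_minutes * 60  # bin width in seconds
    seen = set()
    kept = []
    for fix in fixes:
        key = (fix[0], int(fix[1].timestamp() // width))
        if key not in seen:
            seen.add(key)
            kept.append(fix)
    return kept


utc = timezone.utc
fixes = [
    ("L1", datetime(2024, 1, 1, 0, 5, tzinfo=utc), 23.50, -19.00),
    ("L1", datetime(2024, 1, 1, 0, 35, tzinfo=utc), 23.51, -19.01),
    ("L1", datetime(2024, 1, 1, 1, 10, tzinfo=utc), 23.52, -19.02),
]
print(len(resample_fixes(fixes, 60)))  # 2: the 00:35 fix shares a bin with 00:05
```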

Clustering

dbscan_eps_method:     knn    # data-driven eps from point-cloud density
dbscan_knn_percentile: 90.0   # raise → fewer clusters, lower → more clusters
use_hdbscan:           false  # set true for variable-shape clusters (requires pip install hdbscan)

The knn method derives eps from each individual's own nearest-neighbour distance distribution, so it adapts to the spatial scale of each dataset and works across species without tuning. The alternative adaptive method (eps = median_step × k) remains available for backward compatibility.
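
The idea behind the knn method can be sketched in a few lines: compute each fix's distance to its k-th nearest neighbour, then take a percentile of those distances as eps. This is an illustration only. The choice of `k`, the brute-force search, and the interpolation rule are all assumptions, not the pipeline's implementation.

```python
import math


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of an ascending list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def knn_eps(points, k=5, pct=90.0):
    """Estimate a DBSCAN eps as a percentile of k-th nearest-neighbour distances.

    `points` are (x, y) tuples in a projected CRS (metres). Brute force, O(n^2),
    so suitable only for small illustrative inputs.
    """
    if len(points) <= k:
        raise ValueError("need more than k points")
    kth = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth.append(dists[k - 1])
    kth.sort()
    return percentile(kth, pct)


line = [(float(i), 0.0) for i in range(10)]  # ten fixes spaced 1 m apart
print(knn_eps(line, k=2, pct=90.0))  # 2.0
```

A higher percentile gives a larger eps, which merges neighbouring groups and so yields fewer clusters, matching the comment on `dbscan_knn_percentile` above.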

Multi-dataset runs

dataset_name: caribou_bc
output_dir:   outputs/{dataset_name}   # outputs go to outputs/caribou_bc/

Run separate config files per species — outputs never overwrite each other.
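
The `{dataset_name}` placeholder behaves like an ordinary string template. A minimal sketch of the expansion, assuming the config has been loaded into a dict (`resolve_output_dir` is a hypothetical name, not a pipeline function):

```python
from pathlib import Path


def resolve_output_dir(config):
    """Expand the {dataset_name} placeholder in output_dir (hypothetical helper)."""
    return Path(config["output_dir"].format(dataset_name=config["dataset_name"]))


config = {"dataset_name": "caribou_bc", "output_dir": "outputs/{dataset_name}"}
print(resolve_output_dir(config).as_posix())  # outputs/caribou_bc
```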

Outputs

Every qualifying individual (one whose fix count exceeds min_fixes_per_individual) gets its own set of figures. Population-level figures cover all individuals and scale adaptively to their number.

Per-individual figures

| Figure | Contents |
| --- | --- |
| `fig1_space_use_<id>.png` | GPS trajectory + KDE utilisation distribution + MCP 95% boundary |
| `fig2_kde_heatmap_<id>.png` | KDE density surface with 95% UD contour |
| `fig3_dbscan_<id>.png` | Spatial clusters (adaptive DBSCAN or HDBSCAN) |
| `fig4_seasonal_kde_<id>.png` | Wet-season vs dry-season KDE side by side |

Population figures

| Figure | Contents |
| --- | --- |
| `fig5_behaviour_space.png` | GMM state space: hexbin density per behavioural state |
| `fig6_behaviour_vs_space.png` | Behaviour composition in spatial clusters vs noise |
| `fig7_behaviour_vs_water.png` | Distance to water by behavioural state (requires water_path) |
| `fig8_contraction.png` | Seasonal home range contraction: scatter + ranked bar chart |
| `fig9_mcp_comparison.png` | MCP 100% vs 95% per individual |
| `fig10_water_seasonal.png` | Water distance distributions, wet vs dry (requires water_path) |
| `fig11_individual_heatmap.png` | Per-individual metric heatmap (z-scored across population) |

All population figures scale automatically to the number of individuals — font sizes, figure height, label spacing, and bar dimensions adapt so labels never overlap whether you have 5 individuals or 260.

Tables and GIS layers

| File | Contents |
| --- | --- |
| `mcp_comparison.csv` | MCP 100% and 95% home range areas per individual |
| `individual_summary.csv` | Full metric table: KDE areas, behaviour proportions, cluster statistics |
| `clusters.geojson` | Convex hull polygon per cluster per individual; loads directly in QGIS or ArcGIS |

Run records

| File | Contents |
| --- | --- |
| `run.log` | Timestamped log of every stage: parameters, warnings, statistical results |
| `config_snapshot.yaml` | Exact copy of the config used for this run |
| `run_metadata.json` | Pipeline version, elapsed time, full config |

Methods

Home range — MCP: 100% minimum convex polygon and 95% trimmed MCP per individual and per season.
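
For readers unfamiliar with MCPs, the sketch below computes a convex hull area and a trimmed variant. It assumes the 95% MCP drops the fixes farthest from the centroid, which is one common convention; the pipeline's trimming rule is not stated here and may differ. All function names are illustrative.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; area in squared units of the input coordinates."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = sum(
        vertices[i][0] * vertices[(i + 1) % n][1] - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(total) / 2.0


def mcp_area(points, percent=100.0):
    """MCP area after keeping the `percent` of fixes closest to the centroid (assumed rule)."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    ranked = sorted(points, key=lambda p: math.dist(p, (cx, cy)))
    keep = max(3, math.ceil(len(ranked) * percent / 100.0))
    return polygon_area(convex_hull(ranked[:keep]))


# 19 fixes inside a 2 x 2 square plus one distant outlier
pts = [(0, 0), (2, 0), (2, 2), (0, 2)] + [(1, 1)] * 15 + [(50, 50)]
print(mcp_area(pts, 100.0), mcp_area(pts, 95.0))  # 100.0 4.0
```

The example shows why the 95% variant is reported alongside the 100% MCP: a single excursion can inflate the full hull many times over.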

Home range — KDE: Fixed-bandwidth kernel density estimation (bw_method=0.3) on up to kde_max_pts fixes, evaluated on a kde_grid_size × kde_grid_size grid. Bandwidth is fixed rather than Scott's rule because Scott's oversmooths for large N, inflating KDE areas toward MCP size.
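
Once a density surface has been evaluated on the grid, a 95% utilisation distribution area can be read off by accumulating the densest cells until they hold 95% of the total. The sketch below shows that step only; it takes the density grid as given and is an assumption about how such an area can be computed, not a description of the pipeline's code.

```python
def ud_area(density, cell_area, level=0.95):
    """Area of the smallest set of grid cells holding `level` of the total density.

    `density` is a 2-D list of non-negative values on a regular grid;
    `cell_area` is the area of one cell in the units of the projected CRS.
    """
    cells = sorted((v for row in density for v in row), reverse=True)
    total = sum(cells)
    if total <= 0:
        return 0.0
    cumulative = 0.0
    for count, value in enumerate(cells, start=1):
        cumulative += value
        if cumulative >= level * total:
            return count * cell_area
    return len(cells) * cell_area


# One dominant 50 m x 50 m cell holds 98% of the density
print(ud_area([[0.0, 1.0], [1.0, 98.0]], cell_area=2500.0))  # 2500.0
```

An oversmoothed surface spreads density across more cells, so more of them are needed to reach 95%. This is how an overly wide bandwidth inflates the reported area.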

Spatial clustering — DBSCAN: Run per individual with a data-driven eps derived from the k-nearest-neighbour distance distribution (dbscan_eps_method: knn). This scales naturally to each individual's movement grain and spatial scale without assumptions about step length or fix rate. Optionally replaced by HDBSCAN for datasets with variable cluster shapes.

Behavioural states — GMM: Gaussian Mixture Model on log-speed and absolute turning angle. Number of components selected by BIC from gmm_k_list. State mapping is deterministic: transit = highest log-speed component; rest = lowest turning angle among the remainder; tortuous = highest turning angle. Works correctly for K=2 (transit + rest) and K=3 (adds tortuous) — all downstream figures adapt automatically.
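
The state-mapping rule can be written out directly. The sketch below applies it to component means; `label_states` and its input layout are hypothetical, and the means would come from whatever mixture model was fitted.

```python
def label_states(components):
    """Map GMM components to behavioural state names.

    `components` is a list of (mean_log_speed, mean_abs_turning_angle) tuples,
    one per fitted component. Returns {component_index: state_name}.
    """
    if len(components) not in (2, 3):
        raise ValueError("expected 2 or 3 components")
    indices = range(len(components))
    # transit: the component with the highest mean log-speed
    transit = max(indices, key=lambda i: components[i][0])
    remainder = [i for i in indices if i != transit]
    # rest: lowest mean turning angle among the remaining components
    rest = min(remainder, key=lambda i: components[i][1])
    labels = {transit: "transit", rest: "rest"}
    # tortuous: whatever is left (only present when K=3)
    for i in remainder:
        if i != rest:
            labels[i] = "tortuous"
    return labels


print(label_states([(-1.0, 0.4), (2.0, 0.2), (0.1, 1.5)]))
# {1: 'transit', 0: 'rest', 2: 'tortuous'}
```

Because the labels depend only on the ordering of component means, the mapping does not change with the arbitrary component order a fitting routine returns.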

Seasonal analysis: User-defined wet months. Wilcoxon signed-rank test on paired wet/dry KDE areas. Individuals below kde_min_fixes in a season receive NaN for that season's area and are excluded from the test.
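
To make the exclusion rule concrete, the sketch below drops any individual with a NaN area in either season and computes the signed-rank statistic on the remaining pairs. It is illustrative only. Discarding zero differences is the classical treatment and is assumed here, and the p-value, which a statistics library such as `scipy.stats.wilcoxon` would normally supply, is not computed.

```python
import math


def signed_rank_statistic(wet, dry):
    """Wilcoxon signed-rank W for paired samples, skipping NaN pairs and zero differences.

    Returns (W, n_used), where W is the smaller of the positive and negative rank sums.
    """
    diffs = [
        w - d
        for w, d in zip(wet, dry)
        if not (math.isnan(w) or math.isnan(d)) and w != d
    ]
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # tied values share the mean rank
        for idx in ordered[i : j + 1]:
            ranks[idx] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus), len(diffs)


wet = [10.0, 12.0, float("nan"), 8.0, 15.0]  # KDE areas, one entry per individual
dry = [7.0, 13.0, 5.0, 4.0, 9.0]
print(signed_rank_statistic(wet, dry))  # (1.0, 4): the NaN pair is excluded
```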

Environmental analysis: Nearest-feature distance to water bodies via sjoin_nearest. Aggregated by season and behavioural state. Fully conditional — omit water_path from config to skip.

Warnings

The pipeline logs automatic warnings for degenerate clustering results:

All noise: the individual has too few fixes within the computed eps radius to satisfy dbscan_min_samples. Usually indicates a genuinely sparse tracking schedule. Raise min_fixes_per_individual to exclude these individuals, or lower dbscan_min_samples.
Single cluster: the entire home range fits inside the eps radius. Lower dbscan_knn_percentile to 75–85 for small-range individuals.
More than 85% noise: eps may be too small. Raise dbscan_knn_percentile or switch to use_hdbscan: true.

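The three conditions above can be expressed as a small check on the cluster labels. The sketch assumes the scikit-learn convention that noise fixes are labelled -1. The function name, the returned strings, and the order in which the conditions are tested are all assumptions for illustration.

```python
def clustering_warning(labels, noise_threshold=0.85):
    """Return a warning name for degenerate clustering output, or None.

    `labels` holds one cluster label per fix, with -1 marking noise
    (the scikit-learn convention, assumed here).
    """
    if not labels:
        return None
    clusters = {label for label in labels if label != -1}
    noise_fraction = sum(1 for label in labels if label == -1) / len(labels)
    if not clusters:
        return "all_noise"
    if noise_fraction > noise_threshold:
        return "high_noise"
    if len(clusters) == 1:
        return "single_cluster"
    return None


print(clustering_warning([-1] * 9 + [0]))  # high_noise (90% of fixes are noise)
```
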
Troubleshooting

Missing required columns — column name mappings must match CSV headers exactly (case-sensitive). The log prints available columns on failure.

KDE very slow — reduce kde_max_pts (default 5000) or kde_grid_size (default 200). Set n_jobs: -1 to parallelise per-individual computation.

GMM produces K=2 unexpectedly — BIC found no evidence for three components. Check log for BIC values. If needed, force K=3 by setting gmm_k_list: [3]. Note that forcing K when data do not support it increases classification entropy.

CRS issues — set input_crs explicitly in config.yaml and run --dry-run to verify the logged CRS values before committing to a full run.

Outputs overwriting between runs — use output_dir: outputs/{dataset_name} with a unique dataset_name per config file.

File structure

.
├── run.py
├── config.yaml          ← copy and edit for each dataset
├── requirements.txt
├── data/
│   ├── fixes.csv
│   └── water.geojson    (optional)
└── outputs/
    ├── fig1_space_use_<id>.png   (one per individual)
    ├── fig2_kde_heatmap_<id>.png
    ├── fig3_dbscan_<id>.png
    ├── fig4_seasonal_kde_<id>.png
    ├── fig5_behaviour_space.png
    ├── ...
    ├── clusters.geojson
    ├── mcp_comparison.csv
    ├── individual_summary.csv
    ├── run.log
    ├── config_snapshot.yaml
    └── run_metadata.json

License

MIT
