Merged
24 commits
f74ff38
Bump version from 1.2.2 to 1.3.0
xinitd Apr 1, 2026
1b1eb51
Restructure project into Python package layout
xinitd Apr 5, 2026
fcb34c6
Add GitHub Actions workflow for smoke testing on Python 3.11-3.13
xinitd Apr 5, 2026
ef49939
Add CI status badge to README
xinitd Apr 5, 2026
b9efd60
Add '--exclude' flag to skip directories by name
xinitd Apr 7, 2026
2c5672b
Add '--zip' flag to compress output into archive after collection
xinitd Apr 7, 2026
f093d5c
Add forensic manifest with configurable hashing (sha256/sha512/md5)
xinitd Apr 8, 2026
c116137
Add forensic log with chain-of-custody event tracking
xinitd Apr 9, 2026
57d141e
Add '--resume' flag for checkpoint/resume from forensic log
xinitd Apr 9, 2026
608dd18
Update README and CLI help with new features for 1.3.0
xinitd Apr 9, 2026
5811cd9
Update terminal screenshot with new CLI options
xinitd Apr 9, 2026
32d54f6
Improve image alt text for search engine discoverability
xinitd Apr 9, 2026
e674115
Refactor imports in cli.py for clarity
xinitd Apr 9, 2026
04a1c32
Add context manager to forensic log class
xinitd Apr 9, 2026
adcdb88
Refactor forensic log handling and file copying logic
xinitd Apr 9, 2026
9ea8fd4
Add resume functionality and hash options to CLI
xinitd Apr 9, 2026
e3c5ce4
Refactor cli.py by removing main function
xinitd Apr 9, 2026
2d60af4
Update cli.py
xinitd Apr 9, 2026
f6f5a98
Update hash algorithm options in README
xinitd Apr 9, 2026
0eb31b0
Simplify target_extensions extraction
xinitd Apr 9, 2026
017727e
Handle OSError when reading file attributes
xinitd Apr 9, 2026
4b57350
Resolve source path before adding to completed files
xinitd Apr 9, 2026
6ba0dda
Fix path resolution for resumed files check
xinitd Apr 9, 2026
19594aa
Fix formatting in README for forensic manifest section
xinitd Apr 9, 2026
28 changes: 28 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,28 @@
name: CI

on:
  push:
    branches: ['*']
  pull_request:
    branches: ['*']

jobs:
  smoke-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.11', '3.12', '3.13']

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Verify tool launches
        run: python hypoxia.py --help

      - name: Verify package entry point
        run: python -m hypoxia --help
21 changes: 16 additions & 5 deletions README.md
@@ -1,6 +1,6 @@
<div align="center">

-<img src="assets/logo.png" alt="Hypoxia logo" style="height: 256px; width: 256px; object-fit: contain;">
+<img src="assets/logo.png" alt="Hypoxia - open-source CLI forensic file extraction tool for Linux, macOS, and Windows" style="height: 256px; width: 256px; object-fit: contain;">

<h2>HYPOXIA</h2>

@@ -11,6 +11,7 @@
·
<a href="#command-line-options">Command-Line Options</a>
</p>
+<img alt="CI" src="https://github.com/xinitd/hypoxia/actions/workflows/ci.yml/badge.svg">
<img alt="GitHub Downloads (all assets, all releases)" src="https://img.shields.io/github/downloads/xinitd/hypoxia/total">
<img alt="GitHub contributors" src="https://img.shields.io/github/contributors/xinitd/hypoxia">
<img alt="GitHub Release" src="https://img.shields.io/github/v/release/xinitd/hypoxia">
@@ -20,7 +21,7 @@
<h3 align="center">About</h3>
</div>

-<img src="assets/terminal.png" alt="Terminal">
+<img src="assets/terminal.png" alt="Hypoxia CLI terminal output showing forensic file collection with SHA-256 hashing, directory exclusion, and checkpoint resume">

**Hypoxia** is a lightweight, dependency-free, cross-platform command-line tool designed for targeted file extraction and backup. Written entirely in standard Python, it recursively searches directories and collects files based on a granular set of criteria - including extensions, modification dates, and file sizes.

@@ -38,14 +39,20 @@ Built for efficiency and portability, Hypoxia is the perfect utility for digital
- **Size Boundaries:** e.g., files strictly between `10mb` and `2gb`.
- **Disk Space Awareness:** Monitors free space on the destination drive in real time, issuing warnings and safely halting execution before the disk fills up completely.
- **Metadata Control:** Choose to preserve original file metadata (timestamps, permissions) or discard it to maximize copy speed.
-- **Secure & Robust:** Relies exclusively on Python's standard library (`argparse`, `pathlib`, `datetime`, `shutil`), ensuring maximum compatibility and minimizing security risks.
+- **Forensic Manifest:** Automatically generates a JSON manifest for every collection task - SHA-256 hash, original path, destination path, file size, timestamps, and an overall manifest checksum for integrity verification.
+- **Chain of Custody Log:** Append-only forensic log with timestamped entries for every action (file copied, skipped, errors), establishing a verifiable chain of custody.
+- **Checkpoint/Resume:** If a collection is interrupted (crash, power loss, dying media), feed the forensic log back with `--resume` - Hypoxia continues from exactly where it stopped, verified by path and hash. No wasted time, no duplicates.
+- **Archive Output:** Compress the entire collection into a `.zip` archive with a single flag.
+- **Directory Exclusion:** Skip unwanted directories by name (e.g., system folders, `.git`).
+- **Secure & Robust:** Relies exclusively on Python's standard library (`argparse`, `pathlib`, `datetime`, `shutil`, `hashlib`, `json`, `zipfile`), ensuring maximum compatibility and minimizing security risks.
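
Reviewer note: the manifest described in the added bullets (per-file SHA-256 plus an overall manifest checksum) could be sketched roughly as follows. The diff does not show `hypoxia/forensic.py`, so the function names and JSON field names here are hypothetical, and the checksum-over-canonical-JSON approach is an assumption rather than the PR's actual implementation.

```python
import hashlib
import json
from pathlib import Path


def hash_file(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files are never fully loaded into memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _canonical_bytes(body: dict) -> bytes:
    # Sorted keys and fixed separators give a stable serialization to hash.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_manifest(task_id: str, copied: list[tuple[Path, Path]]) -> dict:
    """Build a manifest dict (hypothetical field names) for (source, destination) pairs."""
    entries = []
    for source, destination in copied:
        stat = destination.stat()
        entries.append({
            "source": str(source),
            "destination": str(destination),
            "size": stat.st_size,
            "modified_ns": stat.st_mtime_ns,
            "sha256": hash_file(destination),
        })
    body = {"task_id": task_id, "files": entries}
    checksum = hashlib.sha256(_canonical_bytes(body)).hexdigest()
    return {**body, "manifest_sha256": checksum}


def verify_manifest(manifest: dict) -> bool:
    """Recompute the checksum over everything except the checksum field itself."""
    body = {key: value for key, value in manifest.items() if key != "manifest_sha256"}
    return hashlib.sha256(_canonical_bytes(body)).hexdigest() == manifest["manifest_sha256"]
```

Excluding the checksum field from its own input is what lets a later reader recompute and compare it; any edit to an entry then makes `verify_manifest` return `False`.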

<div align="center">
<h3 align="center">Use Cases</h3>
</div>

-- **Digital Forensics:** Rapid evidence gathering and metadata extraction.
-- **Data Backup:** Targeted backups of specific file types or recent documents.
+- **Digital Forensics:** Rapid evidence gathering with SHA-256 hashing, forensic manifest, and chain-of-custody logging.
+- **Incident Response:** Collect files from a compromised machine with a single command. Resume interrupted collections from dying media without re-copying.
+- **Data Backup:** Targeted backups of specific file types or recent documents, with integrity verification built in.
- **Disaster Recovery:** Extracting files from corrupted or unbootable operating systems.
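
Reviewer note: one way the append-only log and resume-from-log behaviour described above might fit together is sketched below. It assumes a JSON Lines log with an `event` field and a `"copied"` event name; the real log format and the real `parse_resume_log` in `hypoxia/forensic.py` are not visible in this diff, so every name here is hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_event(log_path: Path, event: str, **fields) -> None:
    """Append one timestamped record; opening in 'a' mode never rewrites earlier lines."""
    record = {"timestamp": datetime.now(timezone.utc).isoformat(), "event": event, **fields}
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def completed_files(log_path: Path) -> dict[str, str]:
    """Map each successfully copied source path to its recorded hash."""
    done: dict[str, str] = {}
    with log_path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # A crash or power loss can leave a truncated final line; skip it.
                continue
            if not isinstance(record, dict):
                continue
            if record.get("event") == "copied" and "source" in record:
                done[record["source"]] = record.get("sha256", "")
    return done
```

On resume, a collector could skip any file whose resolved path is a key in the returned mapping and whose current hash matches the stored value, which lines up with the "verified by path and hash" wording in the README.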

<div align="center">
@@ -106,6 +113,10 @@ This command preserves metadata by default and outputs detailed logs to the term
| `--date-to` | Filter for files modified on or before this date (`YYYY-MM-DD`). | No | - |
| `--size-min` | Minimum file size (e.g., `100mb`). Supported units: `b`, `kb`, `mb`, `gb`. | No | - |
| `--size-max` | Maximum file size (e.g., `2gb`). Supported units: `b`, `kb`, `mb`, `gb`. | No | - |
+| `--exclude` | Comma-separated list of directory names to exclude from scan (e.g., `windows,program files,.git`). | No | - |
+| `--zip` | Compress the output folder into a `.zip` archive after collection. | No | `false` |
+| `--hash` | Hash algorithm for forensic manifest (`sha256`, `none`). | No | `sha256` |
+| `--resume` | Path to a forensic log from a previous interrupted run. Resumes from where it stopped. | No | - |
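
Reviewer note: the `--exclude` row describes skipping directories by name. A minimal sketch of that idea with `os.walk` is shown below; the case-insensitive comparison mirrors the `.lower()` call applied to `exclude_dirs` in `cli.py`, but `iter_files` itself is a hypothetical helper, not a function from this PR.

```python
import os
from pathlib import Path
from typing import Iterator


def iter_files(root: Path, exclude_dirs: list[str]) -> Iterator[Path]:
    """Yield every file under root, never descending into excluded directory names."""
    excluded = {name.lower() for name in exclude_dirs}
    for current, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk (top-down mode) not to descend.
        dirnames[:] = [name for name in dirnames if name.lower() not in excluded]
        for filename in filenames:
            yield Path(current) / filename
```

Pruning during the walk, rather than filtering results afterwards, avoids reading large excluded trees at all, which matters on slow or failing media.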

<div align="center">
<h3 align="center">Legal Disclaimer</h3>
Binary file modified assets/terminal.png
118 changes: 1 addition & 117 deletions hypoxia.py
@@ -1,122 +1,6 @@
#!/usr/bin/env python3


-import argparse
-from argparse import RawTextHelpFormatter
-import sys
-import uuid
-from pathlib import Path
-from utils import *
-from colors import info, error
-
-
-__version__ = '1.2.2'
-
-
-def dir_path(path_string):
-    path_obj = Path(path_string)
-    if path_obj.is_dir():
-        return path_obj
-    else:
-        raise argparse.ArgumentTypeError(f'Directory not found or access denied: "{path_string}"')
-
-
-def main():
-    task_id = str(uuid.uuid4())
-    result = False
-
-    parser = argparse.ArgumentParser(
-        description='Hypoxia: Targeted file extraction and backup utility.',
-        epilog='''
-Options Summary:
-Logging level: -v, --verbosity
-Target location: -s, --search-path
-Target files: -e, --extensions
-Copy behavior: -m, --keep-metadata
-Timeframe filters: --date-from, --date-to
-Size limits: --size-min, --size-max
-''',
-        formatter_class=RawTextHelpFormatter
-    )
-
-    parser.add_argument(
-        '--version',
-        action='version',
-        version=f'%(prog)s {__version__}'
-    )
-    parser.add_argument(
-        '-v', '--verbosity',
-        choices=['silent', 'info'],
-        required=True,
-        help='Set logging level. "silent" suppresses output, "info" logs all actions.'
-    )
-    parser.add_argument(
-        '-s', '--search-path',
-        type=dir_path,
-        required=True,
-        help='Absolute or relative path to the target directory.'
-    )
-    parser.add_argument(
-        '-e', '--extensions',
-        type=str,
-        required=True,
-        help='Comma-separated list of target file extensions (e.g., pdf,docx,txt).'
-    )
-    parser.add_argument(
-        '-m', '--keep-metadata',
-        choices=['yes', 'no'],
-        default='yes',
-        help='Preserve original file metadata (timestamps, permissions). "no" speeds up copying.'
-    )
-    parser.add_argument(
-        '--date-from',
-        type=str,
-        required=False,
-        help='Filter for files modified on or after this date (YYYY-MM-DD).'
-    )
-    parser.add_argument(
-        '--date-to',
-        type=str,
-        required=False,
-        help='Filter for files modified on or before this date (YYYY-MM-DD).'
-    )
-    parser.add_argument(
-        '--size-min',
-        type=str,
-        required=False,
-        help='Minimum file size boundary (e.g., 10kb, 100mb, 2gb).'
-    )
-    parser.add_argument(
-        '--size-max',
-        type=str,
-        required=False,
-        help='Maximum file size boundary (e.g., 10kb, 100mb, 2gb).'
-    )
-
-    args = parser.parse_args()
-
-    verbosity = (args.verbosity == 'info')
-    keep_metadata = (args.keep_metadata == 'yes')
-
-    try:
-        target_extensions = args.extensions.split(',')
-    except Exception as e:
-        error('Invalid --extensions format. Expected a comma-separated list.')
-        sys.exit(1)
-
-    if verbosity:
-        info('Initializing Hypoxia...')
-        info(f'Task ID: {task_id}')
-
-    preparation_result = prepare_workspace(task_id, target_extensions, verbosity)
-    if preparation_result:
-        result = collect_files(
-            task_id, target_extensions, verbosity, keep_metadata, args.search_path, args.date_from, args.date_to, args.size_min, args.size_max
-        )
-
-    if result:
-        if verbosity:
-            info('Extraction task completed successfully.')
+from hypoxia.cli import main


if __name__ == '__main__':
1 change: 1 addition & 0 deletions hypoxia/__init__.py
@@ -0,0 +1 @@
__version__ = '1.3.0'
5 changes: 5 additions & 0 deletions hypoxia/__main__.py
@@ -0,0 +1,5 @@
from hypoxia.cli import main


if __name__ == '__main__':
    main()
156 changes: 156 additions & 0 deletions hypoxia/cli.py
@@ -0,0 +1,156 @@
import argparse
from argparse import RawTextHelpFormatter
import sys
import uuid
from pathlib import Path
from hypoxia import __version__
from hypoxia.utils import prepare_workspace, collect_files, archive_output
from hypoxia.colors import info, error
from hypoxia.forensic import parse_resume_log


def dir_path(path_string):
    path_obj = Path(path_string)
    if path_obj.is_dir():
        return path_obj
    else:
        raise argparse.ArgumentTypeError(f'Directory not found or access denied: "{path_string}"')


def main():
    task_id = str(uuid.uuid4())
    result = False

    parser = argparse.ArgumentParser(
        description='Hypoxia: Targeted file extraction and backup utility.',
        epilog='''
Options Summary:
Logging level: -v, --verbosity
Target location: -s, --search-path
Target files: -e, --extensions
Copy behavior: -m, --keep-metadata
Timeframe filters: --date-from, --date-to
Size limits: --size-min, --size-max
Directory exclusion: --exclude
Archive output: --zip
Hashing: --hash
Resume: --resume
''',
        formatter_class=RawTextHelpFormatter
    )

    parser.add_argument(
        '--version',
        action='version',
        version=f'%(prog)s {__version__}'
    )
    parser.add_argument(
        '-v', '--verbosity',
        choices=['silent', 'info'],
        required=True,
        help='Set logging level. "silent" suppresses output, "info" logs all actions.'
    )
    parser.add_argument(
        '-s', '--search-path',
        type=dir_path,
        required=True,
        help='Absolute or relative path to the target directory.'
    )
    parser.add_argument(
        '-e', '--extensions',
        type=str,
        required=True,
        help='Comma-separated list of target file extensions (e.g., pdf,docx,txt).'
    )
    parser.add_argument(
        '-m', '--keep-metadata',
        choices=['yes', 'no'],
        default='yes',
        help='Preserve original file metadata (timestamps, permissions). "no" speeds up copying.'
    )
    parser.add_argument(
        '--date-from',
        type=str,
        required=False,
        help='Filter for files modified on or after this date (YYYY-MM-DD).'
    )
    parser.add_argument(
        '--date-to',
        type=str,
        required=False,
        help='Filter for files modified on or before this date (YYYY-MM-DD).'
    )
    parser.add_argument(
        '--size-min',
        type=str,
        required=False,
        help='Minimum file size boundary (e.g., 10kb, 100mb, 2gb).'
    )
    parser.add_argument(
        '--size-max',
        type=str,
        required=False,
        help='Maximum file size boundary (e.g., 10kb, 100mb, 2gb).'
    )
    parser.add_argument(
        '--exclude',
        type=str,
        required=False,
        help='Comma-separated list of directory names to exclude from scan (e.g., "windows,program files,.git").'
    )
    parser.add_argument(
        '--zip',
        action='store_true',
        default=False,
        help='Compress the output folder into a .zip archive after collection is complete.'
    )
    parser.add_argument(
        '--hash',
        type=str,
        choices=['sha256', 'none'],
        default='sha256',
        help='Hash algorithm for forensic manifest (default: sha256). Use "none" to disable hashing.'
    )
    parser.add_argument(
        '--resume',
        type=str,
        required=False,
        help='Path to a forensic log file from a previous interrupted run. Resumes collection from where it stopped.'
    )

    args = parser.parse_args()

    verbosity = (args.verbosity == 'info')
    keep_metadata = (args.keep_metadata == 'yes')

    target_extensions = [ext.strip() for ext in args.extensions.split(',')]

    exclude_dirs = [d.strip().lower() for d in args.exclude.split(',')] if args.exclude else []

    resumed_files = {}
    if args.resume:
        resume_path = Path(args.resume)
        if not resume_path.exists():
            error(f'Resume log not found: "{args.resume}"')
            sys.exit(1)
        if verbosity:
            info(f'Resuming from: {args.resume}')
        resumed_files = parse_resume_log(resume_path)
        if verbosity:
            info(f'Previously completed files: {len(resumed_files)}')

    if verbosity:
        info('Initializing Hypoxia...')
        info(f'Task ID: {task_id}')

    preparation_result = prepare_workspace(task_id, target_extensions, verbosity)
    if preparation_result:
        result = collect_files(
            task_id, target_extensions, verbosity, keep_metadata, args.search_path, args.date_from, args.date_to, args.size_min, args.size_max, exclude_dirs, args.hash, resumed_files
        )

    if result:
        if args.zip:
            archive_path = archive_output(task_id, verbosity)
        if verbosity:
            info('Extraction task completed successfully.')
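
Reviewer note: `archive_output` is imported from `hypoxia.utils`, whose body is not part of this diff. As a rough illustration of what the `--zip` step might do, the sketch below compresses a directory with the standard-library `zipfile` module. The helper name `zip_directory` and the choice to place the archive next to the output folder are assumptions.

```python
import zipfile
from pathlib import Path


def zip_directory(output_dir: Path) -> Path:
    """Write '<output_dir name>.zip' beside output_dir and return the archive path."""
    # Creating the archive outside output_dir keeps it out of its own contents.
    archive_path = output_dir.parent / (output_dir.name + ".zip")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(output_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the output folder, with forward slashes.
                archive.write(path, arcname=path.relative_to(output_dir).as_posix())
    return archive_path
```

If the manifest is meant to cover the archive as well, hashing the finished `.zip` and logging that digest would be a natural follow-up step.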
File renamed without changes.