This repository contains the Middleware Harvester, a core component of the FAIRagro advanced middleware architecture. It acts as an orchestrator that runs specialized harvesting plugins (such as the INSPIRE-to-ARC converter). It enables Research Data Infrastructure (RDI) providers to harvest metadata from standardized sources (such as CSW), transform it into standardized Annotated Research Context (ARC) objects, and transmit those objects to the central FAIRagro Middleware API.
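The harvest, transform, and transmit flow described above can be pictured with a small, self-contained sketch. Every name in it (`ArcRecord`, `HarvesterPlugin`, `StaticPlugin`, `InMemoryApi`, `run_harvest`) is hypothetical and chosen for illustration only; the actual plugin contract and API client live in `middleware/harvester/` and may look quite different.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol


@dataclass(frozen=True)
class ArcRecord:
    """Hypothetical stand-in for an ARC object; the real ARC model is richer."""

    identifier: str
    title: str


class HarvesterPlugin(Protocol):
    """Hypothetical plugin contract: yield ARC records from one source."""

    def harvest(self) -> Iterator[ArcRecord]: ...


class StaticPlugin:
    """Toy plugin that reads from an in-memory list instead of a CSW endpoint."""

    def __init__(self, rows: Iterable[tuple[str, str]]) -> None:
        self._rows = list(rows)

    def harvest(self) -> Iterator[ArcRecord]:
        for identifier, title in self._rows:
            yield ArcRecord(identifier=identifier, title=title)


class InMemoryApi:
    """Stand-in for a Middleware API client; it only collects what it receives."""

    def __init__(self) -> None:
        self.received: list[ArcRecord] = []

    def submit(self, arc: ArcRecord) -> None:
        self.received.append(arc)


def run_harvest(plugins: Iterable[HarvesterPlugin], api: InMemoryApi) -> int:
    """Run every plugin, forward each record to the API, and return the count."""
    count = 0
    for plugin in plugins:
        for arc in plugin.harvest():
            api.submit(arc)
            count += 1
    return count


api = InMemoryApi()
sent = run_harvest([StaticPlugin([("ds-1", "Soil moisture 2021")])], api)
print(sent, api.received[0].title)
```

The sketch runs plugins sequentially for clarity; it makes no claim about the concurrency model used by the real orchestrator.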
| Path | Description |
|---|---|
| `middleware/harvester/` | Source code of the central orchestrator and plugin contract. |
| `middleware/inspire/` | Source code of the INSPIRE-to-ARC harvester plugin. |
| `docs/` | Architectural design, mapping specifications, and AI workflow. |
| `spec/` | Project-level architecture and design (cross-cutting concerns). |
| `dev_environment/` | Docker-based local development setup (Mock API, Harvester). |
| `scripts/` | Tooling for quality checks, environment setup, and Git LFS. |
| `docker/` | Dockerfiles and container structure tests. |
For the best out-of-the-box experience, you can run a complete local demonstration. This setup starts a local Mock Middleware API and the Harvester to process and save results locally:

```bash
# Start the full demo stack (requires Docker)
./dev_environment/start.sh
```

Note: Generated ARCs will be saved to `dev_environment/demo_output/`.

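
After a demo run, it can be handy to check what actually landed in `dev_environment/demo_output/`. The helper below is a minimal sketch that counts files per top-level entry. It assumes nothing about how the Harvester lays out ARCs inside that folder, and the function name `summarize_output` is made up for this example.

```python
from pathlib import Path


def summarize_output(root: Path) -> dict[str, int]:
    """Map each top-level entry under root to the number of files it contains."""
    summary: dict[str, int] = {}
    if not root.is_dir():
        # Nothing has been generated yet (or the path is wrong).
        return summary
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            # Count files at any depth below this directory.
            summary[entry.name] = sum(1 for p in entry.rglob("*") if p.is_file())
        else:
            summary[entry.name] = 1
    return summary


if __name__ == "__main__":
    for name, file_count in summarize_output(Path("dev_environment/demo_output")).items():
        print(f"{name}: {file_count} file(s)")
```

Run it from the repository root so the relative path resolves; an empty result simply means the output folder does not exist yet.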
The preferred method for working with this repository is using the Dev Container (VS Code).
- Python 3.12+
- uv (Dependency Management & Workspace Orchestration)
- Docker & Docker Compose
- Git LFS (installed via `./scripts/setup-git-lfs.sh`)
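
For a setup outside the Dev Container, a quick sanity check of the prerequisites listed above might look like the following sketch. The executable names (`uv`, `docker`, `git`, `git-lfs`) are the usual ones on most systems, but that is an assumption about your installation. Docker Compose is not checked separately because it is commonly shipped as a `docker` subcommand.

```python
import shutil
import sys

# Assumed executable names for the tools listed above; adjust if yours differ.
REQUIRED_TOOLS = ("uv", "docker", "git", "git-lfs")


def missing_tools(tools: tuple[str, ...] = REQUIRED_TOOLS) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(min_version: tuple[int, int] = (3, 12)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version


if __name__ == "__main__":
    if not python_ok():
        print(f"Python 3.12+ required, found {sys.version.split()[0]}")
    for tool in missing_tools():
        print(f"Missing prerequisite: {tool}")
```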
Clone the repository and install all workspace dependencies:

```bash
uv sync --all-packages
```

The `dev_environment` folder provides a full stack including a Mock API. Please refer to the Development Environment README for detailed instructions.
Detailed information on how to use, configure, and deploy the specific components can be found in their respective subdirectories:
- Harvester Orchestrator README: Configuration (YAML/Env), CLI options, and orchestration loop.
- INSPIRE Plugin README: Metadata mapping rules and CSW connection settings.
- Architectural Design: Deep dive into the concurrency model and data flow.
- INSPIRE Mapping Spec: The rules for transforming INSPIRE/ISO19139 metadata into ARC objects.
This project uses Spec-Driven Development (SDD). Every feature and architectural decision is documented in spec/ (project-level) or middleware/*/spec/ (component-level) before or during implementation.
AI agents (like GitHub Copilot) use these specs along with AGENTS.md and .agents/skills/ to provide high-context assistance.
- See AI Agent Workflow for details on how to use agents effectively in this project.
We maintain high code quality through automated checks:

```bash
# Run all quality checks (Ruff, Mypy, Pylint, Bandit)
./scripts/quality-check.sh

# Run unit and integration tests
uv run pytest middleware/
```

Maintained by: FAIRagro Middleware Team | License: LICENSE