
TinyIntent Deployment Guide

This guide covers deployment options for TinyIntent in various environments.

Prerequisites

  • Python 3.11+
  • Docker (for containerized deployment)
  • Ollama server
  • At least 4GB RAM and 2GB free disk space
  • CoreML model artifacts (generated via make learn - see Model Generation section)

Environment Configuration

  1. Copy the example environment file:

    cp .env.example .env
  2. Edit .env with your configuration:

    • Set a strong TINYINTENT_SECRET (minimum 32 characters)
    • Configure Ollama URL if not using localhost
    • Set TINYINTENT_EXECUTION_ENABLED=1 to enable helper execution
    • Adjust rate limits and resource constraints as needed
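For reference, a minimal .env might look like the sketch below. The values are illustrative placeholders, not recommendations; .env.example remains the authoritative list of options.

```shell
# Example .env (a sketch — see .env.example for the full set of options)
TINYINTENT_SECRET="change-me-to-a-random-string-at-least-32-chars"
TINYINTENT_ENVIRONMENT=development
TINYINTENT_BIND=127.0.0.1
TINYINTENT_PORT=8787
TINYINTENT_EXECUTION_ENABLED=0
HELPER_RATE_LIMIT=10
MODEL_OLLAMA_URL=http://localhost:11434
```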

Model Generation (Required Before Deployment)

⚠️ CRITICAL: TinyIntent requires CoreML model artifacts before the bridge can start. These models are not included in the repository and must be generated locally.

Generate Required Models

# Generate both SmallIntent.mlmodel and TinyIntent.mlmodel
make learn

# Verify models were created
make doctor

# Expected output:
#   ✅ SmallIntent.mlmodel present (macOS optimized, ≤52MB)
#   ✅ TinyIntent.mlmodel present (mobile optimized, ≤10MB)

Model Files Required

The following files must exist in router/ before starting the bridge:

  • router/SmallIntent.mlmodel - Main router model for macOS (ANE accelerated)
  • router/TinyIntent.mlmodel - Compact version for mobile/constrained environments
  • router/train_summary.json - Training metadata and performance metrics

If these files are missing, the bridge will fail to start with model loading errors.
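A small pre-flight check can catch missing artifacts before you start the bridge. This is a sketch; the file list mirrors the bullets above.

```shell
# Verify the required CoreML artifacts exist in the given directory
# (default: router/). Prints anything missing and returns non-zero.
check_models() {
  dir=${1:-router}
  missing=0
  for f in SmallIntent.mlmodel TinyIntent.mlmodel train_summary.json; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}

# Usage:
#   check_models router || echo "Run 'make learn' to generate the models." >&2
```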

First-Time Setup

# 1. Install dependencies
pip install -r requirements.txt

# 2. Generate models (this will take several minutes)
make learn

# 3. Verify system readiness
make doctor

# 4. Start the bridge service
make bridgesrv

Development Deployment

  1. Install dependencies:

    pip install -r requirements.txt
  2. Start the service:

    make bridgesrv

    or

    python -m tinyintent.bridge.tinyrpc
  3. Verify deployment:

    curl http://localhost:8787/healthz
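When scripting this check (for example in CI), a short retry loop avoids racing the server's startup. A sketch:

```shell
# Poll the health endpoint until the bridge responds, or give up after
# a number of attempts. URL and attempt count are adjustable.
wait_for_health() {
  url=${1:-http://localhost:8787/healthz}
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "bridge is healthy"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "bridge did not become healthy after $tries attempts" >&2
  return 1
}
```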

Production Deployment

Option 1: Docker Compose (Recommended)

  1. Set environment variables:

    export TINYINTENT_SECRET="your-production-secret-key-here"
    export GRAFANA_PASSWORD="your-grafana-password"
  2. Start all services:

    docker-compose up -d
  3. Check service health:

    docker-compose ps
    curl http://localhost:8787/healthz
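For orientation, the bridge service in docker-compose.yml typically looks something like the sketch below. Service and image names here are assumptions; treat the file shipped in the repository as the source of truth.

```yaml
# Sketch of a compose service for the bridge; names are illustrative.
services:
  bridge:
    build: .
    ports:
      - "8787:8787"
    environment:
      TINYINTENT_SECRET: ${TINYINTENT_SECRET}
      TINYINTENT_ENVIRONMENT: production
      MODEL_OLLAMA_URL: http://ollama:11434
    volumes:
      - ./router:/app/router:ro   # CoreML artifacts generated via `make learn`
      - ./data:/app/data
    restart: unless-stopped
```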

Option 2: Systemd Service

  1. Create service user:

    sudo useradd --system --create-home --shell /bin/false tinyintent
  2. Install application:

    sudo -u tinyintent git clone https://github.com/tinyintent/tinyintent.git /home/tinyintent/app
    cd /home/tinyintent/app
    sudo -u tinyintent python -m venv venv
    sudo -u tinyintent ./venv/bin/pip install -r requirements.txt
  3. Create systemd service file /etc/systemd/system/tinyintent.service:

    [Unit]
    Description=TinyIntent Bridge Service
    After=network.target
    Wants=network.target
    
    [Service]
    Type=exec
    User=tinyintent
    Group=tinyintent
    WorkingDirectory=/home/tinyintent/app
    Environment=PATH=/home/tinyintent/app/venv/bin
    EnvironmentFile=/home/tinyintent/app/.env
    ExecStart=/home/tinyintent/app/venv/bin/python -m tinyintent.bridge.tinyrpc
    ExecReload=/bin/kill -HUP $MAINPID
    Restart=always
    RestartSec=5
    
    # Security settings
    NoNewPrivileges=true
    PrivateTmp=true
    ProtectSystem=strict
    ProtectHome=true
    ReadWritePaths=/home/tinyintent/app/data /home/tinyintent/app/logs
    
    [Install]
    WantedBy=multi-user.target
  4. Enable and start service:

    sudo systemctl daemon-reload
    sudo systemctl enable tinyintent
    sudo systemctl start tinyintent

Option 3: Manual Production Setup

  1. Install Python dependencies:

    pip install -r requirements.txt
  2. Create production directories:

    mkdir -p data/episodes logs backups
  3. Set environment variables in the .env file

  4. Start with production settings:

    TINYINTENT_ENVIRONMENT=production \
    TINYINTENT_WORKERS=4 \
    TINYINTENT_BIND=0.0.0.0 \
    python -m tinyintent.bridge.tinyrpc

Security Considerations

Authentication

  • Use a strong, randomly generated TINYINTENT_SECRET
  • Disable ALLOW_DEV_LOCAL in production
  • Use HTTPS in production (reverse proxy recommended)
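A minimal reverse-proxy sketch for the HTTPS recommendation above (nginx, assuming the bridge is bound to 127.0.0.1:8787; hostname and certificate paths are placeholders):

```nginx
# Sketch: terminate TLS in nginx and proxy to the locally bound bridge.
server {
    listen 443 ssl;
    server_name tinyintent.example.com;                   # placeholder

    ssl_certificate     /etc/ssl/certs/tinyintent.pem;    # placeholder
    ssl_certificate_key /etc/ssl/private/tinyintent.key;  # placeholder

    location / {
        proxy_pass http://127.0.0.1:8787;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```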

Network Security

  • Bind to localhost (127.0.0.1) if using a reverse proxy
  • Use firewall rules to restrict access
  • Consider VPN for remote access

Helper Execution

  • Start with TINYINTENT_EXECUTION_ENABLED=0 until a security review is complete
  • Review all helpers before enabling execution
  • Set appropriate resource limits for helpers
  • Monitor helper execution logs

File Permissions

  • Run as non-root user
  • Restrict file system access using sandboxing
  • Regular backup of audit logs and episode data

Model Training and Deployment

Generating CoreML Models

The TinyIntent router uses locally trained CoreML models for intent classification. Generate them using:

# Full automated learning pipeline
make learn

# Individual steps
make router-train    # Train: PyTorch → ONNX → CoreML conversion
make router-eval     # Evaluate model performance
make promote         # Promote the model if evaluation gates pass

This produces versioned CoreML artifacts:

  • router/SmallIntent.mlmodel - macOS optimized (~70-85MB INT8 CoreML)
  • router/TinyIntent.mlmodel - Mobile optimized (≤5MB)
  • router/train_summary.json - Training metadata and metrics

Training Status Monitoring

Check training and model status via API:

# Get comprehensive training summary
curl -H "X-TinyIntent-Secret: $TINYINTENT_SECRET" \
  http://localhost:8787/router/train_summary

# Example response includes:
# - Training metrics (accuracy, precision/recall, F1)
# - Model artifact status and file sizes
# - Calibration statistics and confidence curves
# - Deployment readiness indicators
# - Sample predictions with confidence scores

Model Validation Gates

The training pipeline includes automatic validation:

  • Size constraints: SmallIntent 50-100MB (target 70-85MB), TinyIntent ≤5MB
  • Performance gates: Minimum accuracy, F1 score thresholds
  • Calibration quality: ECE (Expected Calibration Error) limits
  • Latency requirements: P95 inference time under 50ms
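The size gates can be spot-checked locally with a short script. This is a sketch; the limits in the usage comment mirror the documented constraints above.

```shell
# Check an artifact's size against a byte limit.
# check_size <file> <max-bytes>; returns non-zero on a violation.
check_size() {
  f=$1
  max=$2
  sz=$(wc -c < "$f" 2>/dev/null) || { echo "missing: $f" >&2; return 1; }
  if [ "$sz" -gt "$max" ]; then
    echo "FAIL $f: $sz bytes (limit $max)" >&2
    return 1
  fi
  echo "OK   $f: $sz bytes (limit $max)"
}

# Usage, with the limits from the gates above:
#   check_size router/SmallIntent.mlmodel $((100 * 1024 * 1024))
#   check_size router/TinyIntent.mlmodel  $((5 * 1024 * 1024))
```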

Deployment Prerequisites

Before enabling the router in production:

  1. Verify model artifacts exist:

    ls -la router/*.mlmodel
    # Should show SmallIntent.mlmodel and TinyIntent.mlmodel
  2. Check model performance:

    curl -s -H "X-TinyIntent-Secret: $SECRET" \
      http://localhost:8787/router/train_summary | \
      jq '.evaluation_results.promotion_eligible'
  3. Validate model sizes meet constraints:

    curl -s -H "X-TinyIntent-Secret: $SECRET" \
      http://localhost:8787/router/train_summary | \
      jq '.current_models'

Training Data Management

  • Training data is stored in router/data/intents.tsv
  • Episode data can be exported using make export-episodes
  • Models are retrained automatically via make learn
  • Training metadata includes unique training IDs for versioning

Monitoring and Maintenance

Health Checks

  • GET /healthz - Basic health check
  • GET /readyz - Readiness check
  • GET /doctor - Comprehensive system health
  • GET /router/train_summary - Training status and model artifacts

Log Management

  • Audit logs rotate automatically
  • Monitor disk space for episode data
  • Set up log retention policies
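If the host uses logrotate, a retention policy along these lines can back up the built-in rotation for any additional file logs. Paths and counts are assumptions; adjust to your installation.

```
/home/tinyintent/app/logs/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
```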

Backup Strategy

# Backup episode data
make backup-data

# Manual backup
tar -czf backup-$(date +%Y%m%d).tar.gz data/ logs/
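To keep backups from growing without bound, a retention sweep can follow the backup step. A sketch; the 30-day window is an assumption.

```shell
# Prune backup archives older than the given number of days.
# prune_backups [dir] [days]; relies on file modification times.
prune_backups() {
  dir=${1:-backups}
  days=${2:-30}
  find "$dir" -name 'backup-*.tar.gz' -type f -mtime "+$days" -delete
}
```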

Updates

  1. Stop the service
  2. Backup current installation
  3. Pull latest changes
  4. Update dependencies
  5. Run database migrations if needed
  6. Restart service
  7. Verify health

Troubleshooting

Service Won't Start

  1. Check environment variables
  2. Verify Ollama connectivity
  3. Check file permissions
  4. Review logs for errors

Performance Issues

  1. Monitor resource usage
  2. Check helper execution patterns
  3. Review rate limiting settings
  4. Consider scaling with multiple workers

Helper Execution Problems

  1. Check helper manifests
  2. Verify sandbox constraints
  3. Review capability permissions
  4. Check environment variables

Configuration Reference

Environment Variables

Variable                       Default                  Description
TINYINTENT_SECRET              (required)               Authentication secret
TINYINTENT_ENVIRONMENT         development              Environment mode
TINYINTENT_BIND                127.0.0.1                Server bind address
TINYINTENT_PORT                8787                     Server port
TINYINTENT_EXECUTION_ENABLED   0                        Enable helper execution
HELPER_RATE_LIMIT              10                       Helper executions per minute
MODEL_OLLAMA_URL               http://localhost:11434   Ollama server URL

See .env.example for complete configuration options.

Support

For deployment issues:

  1. Check the troubleshooting section
  2. Review application logs
  3. Run system health checks
  4. Consult the project documentation