
Commit 4289368

Switch AI inference path from Modal to Lightning-ready runtime
1 parent fc91390 commit 4289368

10 files changed: 512 additions & 202 deletions

.github/workflows/ai_trading_smoke.yml

Lines changed: 6 additions & 20 deletions

```diff
@@ -6,49 +6,35 @@ on:
 jobs:
   ai-smoke:
     runs-on: ubuntu-latest
-    timeout-minutes: 90
+    timeout-minutes: 30
     env:
       PYTHONUNBUFFERED: "1"
-      MODAL_TOKEN_ID: ${{ secrets.MODAL_TOKEN_ID }}
-      MODAL_TOKEN_SECRET: ${{ secrets.MODAL_TOKEN_SECRET }}
       TWELVEDATA_API_KEYS: ${{ secrets.TWELVEDATA_API_KEYS }}
       ALPHAVANTAGE_API_KEYS: ${{ secrets.ALPHAVANTAGE_API_KEYS }}
       NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
+      TRAINED_MODEL_INFERENCE_URL: ${{ secrets.TRAINED_MODEL_INFERENCE_URL }}
+      TRAINED_MODEL_API_KEY: ${{ secrets.TRAINED_MODEL_API_KEY }}
       AI_SMOKE_TICKERS: "AAPL"
-      TRAINED_MODEL_BASE_MODEL: "Qwen/Qwen2.5-7B-Instruct"
-      TRAINED_MODEL_ADAPTER_PATH: "_smoke_artifacts/lora_solid_adapter"
-      HF_HUB_DISABLE_TELEMETRY: "1"
-      HF_HUB_ENABLE_HF_TRANSFER: "1"
-      TRAINED_MODEL_CPU_THREADS: "4"
     steps:
       - name: Checkout
         uses: actions/checkout@v4

       - name: Set up Python
         uses: actions/setup-python@v5
         with:
-          python-version: "3.11"
+          python-version: "3.10"

       - name: Install dependencies
         run: |
           pip install -r requirements.txt
-          pip install modal hf_transfer
-          pip install torch==2.4.1 --index-url https://download.pytorch.org/whl/cpu
-          pip install "transformers>=4.46.0" "peft>=0.13.2" "accelerate>=1.0.1" "sentencepiece>=0.2.0"
-
-      - name: Fetch trained adapter from Modal volume
-        run: |
-          mkdir -p _smoke_artifacts/lora_solid_adapter
-          modal volume get train-once-artifacts /lora_solid_adapter/adapter_model.safetensors _smoke_artifacts/lora_solid_adapter/adapter_model.safetensors
-          modal volume get train-once-artifacts /lora_solid_adapter/adapter_config.json _smoke_artifacts/lora_solid_adapter/adapter_config.json

       - name: Run AI-only smoke test
-        run: python run_ai_trading_smoke_direct.py
+        run: python run_ai_trading_smoke.py

       - name: Upload AI smoke artifacts
         if: always()
         uses: actions/upload-artifact@v4
         with:
           name: ai-trading-smoke
-          path: results/ai_smoke_direct_*.json
+          path: results/ai_smoke_*.json
           retention-days: 7
```
.github/workflows/deploy_lightning_inference.yml (new file)

Lines changed: 40 additions & 0 deletions

```yaml
name: Deploy Lightning Inference

on:
  workflow_dispatch:

jobs:
  deploy-lightning-inference:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    env:
      LIGHTNING_USERNAME: ${{ secrets.LIGHTNING_USERNAME }}
      LIGHTNING_API_KEY: ${{ secrets.LIGHTNING_API_KEY }}
      LIGHTNING_USER_ID: ${{ secrets.LIGHTNING_USER_ID }}
      TRAINED_MODEL_BASE_MODEL: "Qwen/Qwen2.5-7B-Instruct"
      TRAINED_MODEL_NAME: "quant-trained-trading-model"
      TRAINED_MODEL_CPU_THREADS: "8"
      LIGHTNING_INFERENCE_COMPUTE_NAME: "cpu"
      LIGHTNING_INFERENCE_DISK_GB: "80"
      TRAINED_MODEL_API_KEY: ${{ secrets.TRAINED_MODEL_API_KEY }}
      TRAINED_MODEL_ADAPTER_ARCHIVE_URL: ${{ secrets.TRAINED_MODEL_ADAPTER_ARCHIVE_URL }}
      TRAINED_MODEL_ADAPTER_ARCHIVE_TOKEN: ${{ secrets.TRAINED_MODEL_ADAPTER_ARCHIVE_TOKEN }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install Lightning deploy dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install lightning-app==2.3.2 lightning-cloud==0.5.70

      - name: Lightning preflight
        run: python quant_platform/scripts/lightning_account_preflight.py

      - name: Deploy inference service
        run: python deploy_lightning_inference.py
```

README.md

Lines changed: 13 additions & 5 deletions

````diff
@@ -17,7 +17,9 @@ trading_bot/
 ├── main.py                  # Daily core bot + AI bot orchestration
 ├── llm_trader.py            # AI trading branch using the trained model
 ├── trained_model_client.py  # Remote HTTP client for trained-model inference
-├── modal_trained_model_service.py
+├── trained_model_service_runtime.py
+├── lightning_trained_model_app.py
+├── deploy_lightning_inference.py
 ├── backtesting/             # Existing research stack in the bot repo
 └── quant_platform/          # Merged train-once quant platform repo
 ```
@@ -26,27 +28,32 @@ trading_bot/

 - Core bot remains unchanged in principle: price ingestion, feature generation, OLS ranking, meta-learner, portfolio logic
 - AI trading bot is separate and now uses the trained quant model over HTTP
-- The AI path is batched and designed to call the Modal CPU endpoint, not a local model
+- The AI path is batched and designed to call a remote CPU inference endpoint, not a local model

 ## Secrets

 ### Still used
 - `NVIDIA_API_KEY`: news sentiment path
-- `TRAINED_MODEL_INFERENCE_URL`: deployed Modal CPU inference URL for the AI trading bot
+- `TRAINED_MODEL_INFERENCE_URL`: deployed inference URL for the AI trading bot
 - `TRAINED_MODEL_API_KEY`: optional auth for the trained-model endpoint
 - `TWELVEDATA_API_KEYS`, `ALPHAVANTAGE_API_KEYS`: optional price providers

 ### No longer used by the AI trading bot
 - `NVIDIA_REASONING_API_KEY`
+- `MODAL_TOKEN_ID`
+- `MODAL_TOKEN_SECRET`

 ## Main Workflows

 - `.github/workflows/daily_trading_bot.yml`
   - Daily root bot workflow
   - Core + AI orchestration
 - `.github/workflows/ai_trading_smoke.yml`
-  - AI-only smoke test against the trained model endpoint
+  - AI-only smoke test against the remote trained-model endpoint
   - Does not run the core strategy
+- `.github/workflows/deploy_lightning_inference.yml`
+  - Deploys the trained-model inference service to Lightning AI
+  - Leaves the core bot untouched
@@ -89,5 +96,6 @@ python run_ai_trading_smoke.py
 ## Notes

 - The AI bot is remote-only and expects the trained model to be served externally.
-- The current deployment target is Modal CPU.
+- The current deployment target is Lightning AI CPU.
+- The Lightning inference app can either mount a ready adapter directory or download a `tar.gz` / `.zip` archive via `TRAINED_MODEL_ADAPTER_ARCHIVE_URL`.
 - The core bot and AI bot remain logically separate even though they now live in one combined repo.
````

RUNBOOK.md

Lines changed: 8 additions & 4 deletions

````diff
@@ -4,7 +4,8 @@
 - **Python 3.9+**
 - **Dependencies**: `pandas`, `requests`, `pyyaml`, `yfinance`, `python-dotenv`
 - **Email Configuration**: Gmail App Password required in `.env`.
-- **LLM (Optional)**: NVIDIA API key(s) in `.env` for LLM sentiment scoring and AI trade selection.
+- **News Sentiment (Optional)**: NVIDIA API key(s) in `.env` for news sentiment scoring.
+- **AI Trading Bot**: Remote trained-model inference endpoint URL in `.env` or GitHub secrets.

 ## 2. Setup
 1. Move the `trading_bot` folder to your desired location (e.g., home folder).
@@ -16,8 +17,11 @@
 ```
 5. (Optional) NVIDIA keys:
    - `NVIDIA_API_KEY` enables News/LLM sentiment if `news.enabled: true` in `config.yaml`.
-   - `NVIDIA_REASONING_API_KEY` enables the AI strategy trade selection (`ai_trading.enabled: true`).
+   - `NVIDIA_REASONING_API_KEY` is no longer used by the AI trading bot.
    - Do not commit `.env` (it is gitignored).
+6. AI trading endpoint:
+   - `TRAINED_MODEL_INFERENCE_URL` points the AI trading bot at the hosted trained-model service.
+   - `TRAINED_MODEL_API_KEY` optionally protects that endpoint.

 ## 3. Daily Workflow
 The bot is fully automated:
@@ -45,5 +49,5 @@ Note: PineScript translation is a placeholder; set `strategy.type: pine` with a
 - **No Email**: Verify `SENDER_EMAIL` and `SENDER_PASSWORD` in `.env`.
 - **No Data**: Ensure internet connection is active (Wi-Fi check).
 - **AI Strategy Not Trading**:
-  - If the AI LLM call fails, the run continues but new AI entries are blocked (to avoid hallucinated trades).
-  - Check that `NVIDIA_REASONING_API_KEY` is set and the configured model is available.
+  - If the trained-model endpoint call fails, the run continues but new AI entries are blocked.
+  - Check that `TRAINED_MODEL_INFERENCE_URL` is set and the hosted service is healthy.
````
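For local runs, the two endpoint settings from the RUNBOOK's step 6 sit alongside the existing keys in `.env`. A placeholder example — every value below is illustrative, not a real endpoint or secret:

```
# .env (placeholders; copy real values from your deployed service / GitHub secrets)
TRAINED_MODEL_INFERENCE_URL=https://<your-lightning-app-host>
TRAINED_MODEL_API_KEY=<optional-shared-secret>
NVIDIA_API_KEY=<optional-news-sentiment-key>
```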

config.yaml

Lines changed: 1 addition & 0 deletions

```diff
@@ -117,6 +117,7 @@ ai_trading:
   inference_url: ""
   inference_url_env: "TRAINED_MODEL_INFERENCE_URL"
   api_key_env: "TRAINED_MODEL_API_KEY"
+  # The endpoint is expected to be a remote CPU inference service (for example Lightning AI).
   timeout_seconds: 600
   model_name: "quant-trained-trading-model"
```

deploy_lightning_inference.py (new file)

Lines changed: 93 additions & 0 deletions

```python
from __future__ import annotations

import argparse
import json
import os
from pathlib import Path
import sys


ROOT_DIR = Path(__file__).resolve().parent
QP_SRC_DIR = ROOT_DIR / "quant_platform" / "src"
if str(QP_SRC_DIR) not in sys.path:
    sys.path.insert(0, str(QP_SRC_DIR))

from lightning_cloud_utils import (  # noqa: E402
    ensure_auth_env,
    find_app_by_name,
    get_client_and_project,
    json_safe,
    phase_name,
    set_process_env,
)

from lightning_app.runners.runtime import dispatch  # noqa: E402
from lightning_app.runners.runtime_type import RuntimeType  # noqa: E402


ENV_KEYS = (
    "TRAINED_MODEL_BASE_MODEL",
    "TRAINED_MODEL_NAME",
    "TRAINED_MODEL_CPU_THREADS",
    "TRAINED_MODEL_CPU",
    "TRAINED_MODEL_API_KEY",
    "TRAINED_MODEL_ADAPTER_PATH",
    "TRAINED_MODEL_ADAPTER_ARCHIVE_URL",
    "TRAINED_MODEL_ADAPTER_ARCHIVE_TOKEN",
    "TRAINED_MODEL_CACHE_DIR",
    "LIGHTNING_INFERENCE_COMPUTE_NAME",
    "LIGHTNING_INFERENCE_DISK_GB",
    "LIGHTNING_INFERENCE_PORT",
    "TRAINED_MODEL_LOG_LEVEL",
)


def _collect_env() -> dict[str, str]:
    env_vars: dict[str, str] = {}
    for key in ENV_KEYS:
        value = os.getenv(key)
        if value:
            env_vars[key] = value
    return env_vars


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--app-name", default="trading-bot-lightning-inference")
    parser.add_argument("--blocking", action="store_true")
    parser.add_argument("--open-ui", action="store_true")
    args = parser.parse_args()

    auth_env = ensure_auth_env()
    set_process_env(auth_env)
    client, project = get_client_and_project()

    entrypoint = ROOT_DIR / "lightning_trained_model_app.py"
    env_vars = _collect_env()

    dispatch(
        entrypoint,
        RuntimeType.CLOUD,
        start_server=False,
        no_cache=False,
        blocking=args.blocking,
        open_ui=args.open_ui,
        name=args.app_name,
        env_vars=env_vars,
        secrets={},
    )

    latest = find_app_by_name(client, project.project_id, args.app_name)
    payload = {
        "project_id": project.project_id,
        "project_name": project.name,
        "app_name": args.app_name,
        "app_id": getattr(latest, "id", None) if latest else None,
        "phase": phase_name(latest) if latest else None,
        "note": "Copy the Lightning service URL from the app layout once the inference work is running.",
    }
    print(json.dumps(json_safe(payload), indent=2))


if __name__ == "__main__":
    main()
```

lightning_trained_model_app.py (new file)

Lines changed: 52 additions & 0 deletions

```python
from __future__ import annotations

import os
from pathlib import Path

from lightning_app import BuildConfig, CloudCompute, LightningApp, LightningFlow, LightningWork


ROOT_DIR = Path(__file__).resolve().parent
REQUIREMENTS_FILE = ROOT_DIR / "requirements-lightning-inference.txt"
DEFAULT_COMPUTE_NAME = os.getenv("LIGHTNING_INFERENCE_COMPUTE_NAME", "cpu")
DEFAULT_DISK_SIZE_GB = int(os.getenv("LIGHTNING_INFERENCE_DISK_GB", "80") or 80)
DEFAULT_PORT = int(os.getenv("LIGHTNING_INFERENCE_PORT", "8000") or 8000)


class TrainedModelInferenceWork(LightningWork):
    def __init__(self) -> None:
        build_config = BuildConfig(requirements=[str(REQUIREMENTS_FILE.resolve())])
        cloud_compute = CloudCompute(name=DEFAULT_COMPUTE_NAME, disk_size=DEFAULT_DISK_SIZE_GB)
        super().__init__(
            parallel=True,
            port=DEFAULT_PORT,
            raise_exception=False,
            cloud_build_config=build_config,
            cloud_compute=cloud_compute,
        )

    def run(self) -> None:
        import uvicorn

        os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
        uvicorn.run(
            "trained_model_service_runtime:app",
            host="0.0.0.0",
            port=self.port,
            log_level=os.getenv("TRAINED_MODEL_LOG_LEVEL", "info").lower(),
        )


class RootFlow(LightningFlow):
    def __init__(self) -> None:
        super().__init__()
        self.inference = TrainedModelInferenceWork()

    def run(self) -> None:
        self.inference.run()

    def configure_layout(self):
        return [{"name": "trained-model-inference", "content": self.inference.url}]


app = LightningApp(RootFlow())
```
