From 0d15be305e4db3eeb46f1bd7339958b755a6055e Mon Sep 17 00:00:00 2001 From: qinxuye Date: Mon, 18 May 2026 13:16:45 +0800 Subject: [PATCH 1/2] doc: Add AI agent guidance for the project --- .gitignore | 2 + AGENTS.md | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++ CLAUDE.md | 1 + 3 files changed, 183 insertions(+) create mode 100644 AGENTS.md create mode 120000 CLAUDE.md diff --git a/.gitignore b/.gitignore index 5f5f195c3a..1e5b69be27 100644 --- a/.gitignore +++ b/.gitignore @@ -156,4 +156,6 @@ asv/results *.md !README.md !README_*.md +!AGENTS.md +!CLAUDE.md .history/ diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000000..1fa21e2f74 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,180 @@ +# AGENTS.md + +Guidance for AI coding agents working in this repository. + +## Project Overview + +Xinference is a Python model-serving project for language, embedding, rerank, +image, video, audio, and multimodal models. It exposes CLI commands, a Python +client, REST/OpenAI-compatible APIs, an xoscar-based distributed runtime, and a +React Web UI. + +Primary package and entry points: + +- `xinference/`: Python package. +- `xinference/deploy/cmdline.py`: CLI entry points for `xinference`, + `xinference-local`, `xinference-supervisor`, and `xinference-worker`. +- `xinference/core/`: supervisor/worker runtime and actor orchestration. +- `xinference/model/`: model families, engines, built-in model specs, and model + tests. +- `xinference/api/`: API server and OpenAI-compatible routes. +- `xinference/client/`: sync and async Python clients. +- `xinference/ui/web/ui/`: React Web UI. +- `doc/source/`: Sphinx documentation. +- `.github/workflows/python.yaml`: main lint and test CI. + +## Working Rules + +- Prefer small, focused changes that match the current module's style. +- Preserve backward compatibility. If a breaking change is unavoidable, document + the reason and add deprecation behavior where practical. +- Do not edit `xinference/thirdparty/` unless the task explicitly concerns + vendored code. +- Avoid broad refactors while fixing a localized bug. +- Keep public API behavior, request/response schemas, model registration names, + and CLI flags stable unless the task requires changing them. +- Add or update tests for behavior changes. For model-runtime changes, prefer + tests close to the affected model family under `xinference/model/**/tests/`. +- Use type hints for new Python code when practical; this project encourages + PEP 484 style annotations. +- Treat docs and examples as user-facing API. Keep command examples accurate. + +## Environment Setup + +Recommended local setup: + +```bash +conda create --name xinf python=3.12 nodejs +conda activate xinf +pip install -e ".[dev]" +``` + +Notes: + +- `setup.py develop/install/sdist` runs the Web UI build unless + `NO_WEB_UI=1` is set. +- For Python 3.12 and newer, CI installs `setuptools<82`; use the same pin if + packaging or editable installs fail. +- The project supports Python 3.10 through 3.13 in CI. +- Optional model engines have extras in `setup.cfg`, such as `transformers`, + `vllm`, `mlx`, `embedding`, `rerank`, `image`, `video`, and `audio`. + +## Formatting and Linting + +Python formatting and checks are managed through pre-commit: + +```bash +pip install pre-commit +pre-commit run --files +``` + +For a branch-wide check against upstream main: + +```bash +pre-commit run --from-ref=upstream/main --to-ref=HEAD --all-files +``` + +Configured hooks include Black, end-of-file/trailing-whitespace checks, Flake8, +isort, mypy with missing imports ignored, and codespell. Configuration lives in +`.pre-commit-config.yaml`, `pyproject.toml`, and `setup.cfg`. + +## Python Tests + +Run focused tests first: + +```bash +pytest -vv path/to/test_file.py +``` + +The broad CI-style non-GPU test command is approximately: + +```bash +pytest --timeout=3000 -W ignore::PendingDeprecationWarning -vv \ + --cov-config=setup.cfg --cov-report=xml --cov=xinference \ + --ignore xinference/core/tests/test_continuous_batching.py \ + --ignore xinference/model/image/tests/test_stable_diffusion.py \ + --ignore xinference/model/image/tests/test_got_ocr2.py \ + --ignore xinference/model/audio/tests \ + --ignore xinference/model/embedding/tests/test_integrated_embedding.py \ + --ignore xinference/model/llm/transformers/tests/test_tensorizer.py \ + --ignore xinference/model/llm/tests/test_llm_model.py \ + --ignore xinference/model/llm/vllm \ + --ignore xinference/model/llm/sglang \ + --ignore xinference/client/tests/test_client.py \ + --ignore xinference/client/tests/test_async_client.py \ + --ignore xinference/model/llm/mlx \ + xinference +``` + +Use narrower commands for daily development. Many model tests require large +dependencies, GPU, Metal, network access, or model downloads. + +## Frontend + +The Web UI is under `xinference/ui/web/ui` and uses React 18, Material UI, +react-router, i18next, ESLint, and Prettier. + +Common commands: + +```bash +cd xinference/ui/web/ui +npm ci +npm start +npm run build +npx eslint . +./node_modules/.bin/prettier --check . +``` + +Use `npm run format` only when you intentionally want ESLint autofixes and +Prettier writes across the UI tree. + +## Documentation + +Documentation source is in `doc/source`. + +Common docs dependencies are included in the `doc` extra: + +```bash +pip install -e ".[doc]" +cd doc +make html +``` + +When changing CLI behavior, API behavior, deployment behavior, or model support, +update the relevant documentation pages in `doc/source`. + +## Model and Runtime Conventions + +- Keep model-family logic inside the relevant `xinference/model//` + package. +- Keep built-in model metadata changes close to existing specs and tests. +- Be careful with lazy imports and optional dependencies. Import heavyweight + model libraries only where needed so unrelated installs still work. +- Preserve platform guards for Linux-only, CUDA-only, and macOS Metal/MLX paths. +- For distributed runtime changes, consider both local mode and + supervisor/worker mode. +- For OpenAI-compatible behavior, verify request/response fields and streaming + behavior against existing API and client tests. + +## CI Expectations + +The main CI workflow: + +- Runs `pre-commit run --all-files`. +- Runs UI `npm ci`, `npx eslint .`, and Prettier check. +- Tests Python 3.10 through 3.13 across Linux, macOS, and Windows. +- Has special GPU and macOS Metal jobs for model-specific paths. + +Before marking a change done, run the smallest meaningful validation command +that covers the behavior you changed, and mention any broader checks that were +not run because of environment cost or missing hardware. + +## Git and Review Hygiene + +- Keep commits scoped to the requested change. +- Do not rewrite or revert user changes in an existing worktree unless asked. +- If the active checkout is busy or on an unrelated branch, use a separate + worktree and a semantic branch name such as `fix/...`, `feat/...`, or + `docs/...`. +- In PR reviews, inspect current GitHub review threads before adding duplicate + comments. diff --git a/CLAUDE.md b/CLAUDE.md new file mode 120000 index 0000000000..47dc3e3d86 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file From 93cabadf29722b3722e593b2c992bb5578883ee8 Mon Sep 17 00:00:00 2001 From: qinxuye Date: Mon, 18 May 2026 14:42:14 +0800 Subject: [PATCH 2/2] doc: Address agent guidance review comments --- AGENTS.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 1fa21e2f74..ea564bce80 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,4 +1,4 @@ -# AGENTS.md +# AI Agent Guidance Guidance for AI coding agents working in this repository. @@ -122,7 +122,7 @@ npm ci npm start npm run build npx eslint . -./node_modules/.bin/prettier --check . +npx prettier --check . ``` Use `npm run format` only when you intentionally want ESLint autofixes and