[train] Use custom wheel for vllm-router for /chat/completions fix #1601
Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
Code Review
This pull request introduces a custom wheel for vllm-router to resolve an issue with the /chat/completions endpoint. The review feedback points out an inconsistency: the wheel is restricted to x86_64 architectures, which may cause users on other Linux platforms (such as ARM64) to unknowingly use the buggy version from PyPI. It is recommended to use a git source or provide wheels for other architectures to ensure the fix is applied consistently across all supported platforms.
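As a sketch of the recommended alternative, a git source pinned to the commit containing the fix might look like the following in `pyproject.toml` with uv; the repository URL and `rev` below are illustrative placeholders, not values from this PR:

```toml
# Hypothetical pyproject.toml snippet (illustrative only).
# A git source applies on every platform, at the cost of building
# the package from source during installation.
[tool.uv.sources]
vllm-router = { git = "https://github.com/vllm-project/router", rev = "<commit-with-fix>" }
```

Note that a git source requires building `vllm-router` from source, which, per the discussion below, involves extra packaging steps for its Rust component.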
It looks like the official PyPI wheel involves some additional packaging steps, and unfortunately those steps are not listed in the README. Seeing the following errors on CI (inspecting the vllm-router logs):

```
Process vllm-router:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/ray/anaconda3/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ray/default/skyrl/backends/skyrl_train/inference_servers/vllm_router.py", line 44, in _run_router_with_logging
    launch_router(router_args)
  File "/home/ray/.cache/uv/builds-v0/.tmpkuayMU/lib/python3.12/site-packages/vllm_router/launch_router.py", line 52, in launch_router
    raise e
  File "/home/ray/.cache/uv/builds-v0/.tmpkuayMU/lib/python3.12/site-packages/vllm_router/launch_router.py", line 45, in launch_router
    raise RuntimeError("Rust Router is not installed")
RuntimeError: Rust Router is not installed
```
GPU CI is passing: https://github.com/NovaSky-AI/SkyRL/actions/runs/25903325504/job/76131304194?pr=1601 I believe we are good to merge for now. Will track building a wheel for ARM64 in a follow-up issue.
What does this PR do?
Addresses the issue with `/chat/completions` in the new inference codepath (#1591). The issue is that `vllm-router` drops extra arguments (i.e., arguments not in the OpenAI spec). I made a PR to fix this: vllm-project/router#162.

While we wait for a new `vllm-router` release, we should ensure that SkyRL integrations that rely on `/chat/completions` don't break because of this. This PR moves our `vllm-router` dependency to a custom wheel built by cherry-picking the fix on top of the latest release (0.1.14). The wheel is currently built only for the x86_64 arch.
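For reference, a minimal sketch of what pinning a dependency to a custom wheel with a platform restriction can look like with uv sources in `pyproject.toml`; the URL, wheel filename, and marker here are illustrative placeholders, not the actual values used in this PR:

```toml
# Hypothetical pyproject.toml snippet (illustrative only).
# Pulls vllm-router from a custom wheel on x86_64; other platforms
# fall back to the PyPI resolution, which is the inconsistency the
# review above calls out.
[tool.uv.sources]
vllm-router = { url = "https://example.com/wheels/vllm_router-0.1.14.post1-py3-none-manylinux_2_17_x86_64.whl", marker = "platform_machine == 'x86_64'" }
```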