DO NOT MERGE - pin vllm to main commit 9ea3a4015 for FP8 MoE+LoRA fix (vllm#42120) by JohannesHa · Pull Request #2849 · PrimeIntellect-ai/prime-rl

JohannesHa · 2026-06-21T22:12:13Z

Swap the v0.22.0 release wheels for per-commit cu129 wheels from wheels.vllm.ai built at 9ea3a4015b41 (merge of vllm-project/vllm#42120), which fixes FP8 MoE + LoRA output corruption and base-model contamination. This unblocks LoRA targeting FP8 MoE experts under expert parallelism on the GLM-5.1 stack. Revert to a tagged release once the fix lands in one.

vllm: 0.22.0+cu129 -> 0.23.1rc1.dev189+g9ea3a4015.cu129. uv.lock regenerated (this main build also bumps flashinfer, compressed-tensors, starlette and CUDA toolkit deps transitively). Wheel URLs use %2B-encoded '+' as required by wheels.vllm.ai.
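The %2B requirement can be illustrated with a short sketch. It assumes nothing about the layout of wheels.vllm.ai: the wheel filename below (Python and platform tags included) is a hypothetical placeholder, and only the version string is taken from this PR.

```python
from urllib.parse import quote, unquote


def encode_wheel_filename(filename: str) -> str:
    """Percent-encode a wheel filename so a literal '+' becomes '%2B'."""
    # quote() leaves letters, digits and '_.-~' untouched; with safe=""
    # every other character, including '+', is percent-encoded.
    return quote(filename, safe="")


# Version string from this PR; the surrounding filename is hypothetical.
version = "0.23.1rc1.dev189+g9ea3a4015.cu129"
filename = f"vllm-{version}-cp38-abi3-manylinux1_x86_64.whl"

encoded = encode_wheel_filename(filename)
print(encoded)
```

Decoding with `unquote` recovers the original filename, so the encoded form is only needed where the URL is written down (for example in a dependency source entry).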

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vllm moved get_max_tokens from vllm.entrypoints.utils to vllm.entrypoints.serve.utils.api_utils (same signature). The old module no longer exists in the 0.23.1rc1 main build pinned for the #42120 FP8 MoE+LoRA fix, so the inference APIServer crashed at import time with:

ModuleNotFoundError: No module named 'vllm.entrypoints.utils'

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
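One way to tolerate a module move like this across vllm versions is to try the new location first and fall back to the old one. The sketch below is an assumption about how that could look, not the change made in this PR (which may simply update the import). The helper `resolve_first` is a hypothetical name; the two module paths are the ones named in the commit message.

```python
import importlib
from typing import Any, Optional, Sequence

# Module paths from the commit message: new location first, old one second.
GET_MAX_TOKENS_MODULES = (
    "vllm.entrypoints.serve.utils.api_utils",
    "vllm.entrypoints.utils",
)


def resolve_first(module_names: Sequence[str], attr: str) -> Optional[Any]:
    """Return `attr` from the first importable module that defines it.

    Returns None when no candidate module can be imported or none of
    them exposes the attribute. (Hypothetical helper for illustration.)
    """
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:  # ModuleNotFoundError is a subclass
            continue
        value = getattr(module, attr, None)
        if value is not None:
            return value
    return None
```

A caller would use `get_max_tokens = resolve_first(GET_MAX_TOKENS_MODULES, "get_max_tokens")` and raise a clear error if the result is None, rather than letting the server fail with a bare ModuleNotFoundError at import time.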

JohannesHa and others added 2 commits June 21, 2026 14:27

JohannesHa wants to merge 2 commits into main from pin-vllm-moe-lora-fp8
