chore(deps): update loader dependencies major (major)#194
Open
dreadnode-renovate-bot[bot] wants to merge 1 commit into
| datasource | package | from | to |
| ---------- | ------------ | ------ | ----- |
| pypi | psutil | 6.1.1 | 7.2.2 |
| pypi | transformers | 4.57.6 | 5.9.0 |
This PR contains the following updates:
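| Package | Change |
| ------------ | --------------------- |
| psutil | `==6.1.1` → `==7.2.2` |
| transformers | `==4.57.6` → `==5.9.0` |

Warning
Some dependencies could not be looked up. Check the Dependency Dashboard for more information.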
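Before merging, it can help to confirm that an environment actually resolves to the new pins. The sketch below is illustrative and not part of the Renovate output: `PINS` mirrors the updates listed above, and `installed_version` and `check_pins` are hypothetical helper names. `check_pins` accepts a lookup function so it can be exercised without the packages installed.

```python
from importlib import metadata

# Pins copied from this PR's update list.
PINS = {"psutil": "7.2.2", "transformers": "5.9.0"}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Map each package whose resolved version differs from its pin
    to an (expected, found) tuple. An empty dict means all pins match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found}")
```

Run as a script inside the loader's environment, it prints one line per mismatched package and nothing when both pins resolve as expected.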
Release Notes
giampaolo/psutil (psutil)
v7.2.2 (Compare Source)
v7.2.1 (Compare Source)
v7.2.0 (Compare Source)
v7.1.3 (Compare Source)
v7.1.2 (Compare Source)
v7.1.1 (Compare Source)
v7.1.0 (Compare Source)
v7.0.0 (Compare Source)
huggingface/transformers (transformers)
v5.9.0 (Compare Source)
Release v5.9.0
New Model additions
Cohere2Moe
Command A+ is a Mixture-of-Experts (MoE) language model from Cohere that features a hybrid attention pattern combining sliding window and full attention layers. The model incorporates both shared and routed experts and supports a very large context window for processing extensive text sequences.
Links: Documentation
Parakeet tdt (#44171)
HRM-Text
HRM-Text is an improved autoregressive language-modeling variant of the Hierarchical Reasoning Model (HRM) that uses a hierarchical recurrent forward pass with two transformer stacks - one for slow, abstract planning (H) and one for fast, detailed computation (L) - reused inside a nested recurrence. It features PrefixLM attention where instruction tokens attend bidirectionally while response tokens attend causally, per-head sigmoid output gates, and parameterless RMSNorm. The model is designed as a base language model without instruction tuning or chat templates.
Links: Documentation | Paper
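The PrefixLM attention pattern described for HRM-Text can be sketched as a boolean mask. This is the generic PrefixLM construction, not code taken from the transformers implementation, which may differ. `prefix_lm_mask` and `prefix_len` (the number of instruction tokens) are hypothetical names.

```python
def prefix_lm_mask(seq_len, prefix_len):
    """Return a seq_len x seq_len boolean matrix where mask[q][k] is True
    when query position q may attend to key position k.

    Assumption: the first prefix_len positions are instruction tokens and
    the remaining positions are response tokens.
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must be between 0 and seq_len")
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            # Instruction tokens see each other in both directions.
            in_prefix_block = q < prefix_len and k < prefix_len
            # Every token sees itself and everything before it.
            causal = k <= q
            row.append(in_prefix_block or causal)
        mask.append(row)
    return mask
```

With `seq_len=4` and `prefix_len=2`, the two instruction rows are identical (each sees both instruction tokens and nothing later), while the two response rows form the usual lower-triangular causal pattern.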
Breaking changes
The `text_embeds` input for SAM3, EdgeTAM, and SAM3-Lite-Text models now expects full text embeddings instead of just pooler outputs, aligning with other models in the library. Users must update their inputs accordingly.
Audio
Audio support was expanded with the addition of AudioFlamingoNext model checkpoints and improved compilability of audio/vision encoders via standalone pure functions. Additional improvements include better error messaging when loading audio from video files and new documentation for audio/video processors.
Generation
Fixed generation issues including `inputs_embeds` and `per_layer_inputs` handling for Gemma4, an `AttributeError` in RAG's `generate()` caused by missing config fields, and flaky VLM generation tests by blocking special image tokens during sampling.
Bugfixes and improvements
- … `masking_utils.py` (#46066) by @Cyrilvallez in [#46066]
- … `huggingface.co` domain in prose links (#46042) by @kiwigitops in [#46042]
- [`HRM Text`] Add integration tests (#46033) by @vasqu in [#46033]
- … `_attn_implementation` and fix request offset in `generate_batch()` (#45943) by @sergiopaniego in [#45943]
- … `per_layer_inputs` for every Gemma4 variants (#45927) by @Cyrilvallez in [#45927]
Significant community contributions
The following contributors have made significant changes to the library over the last release:
v5.8.1: Patch release v5.8.1 (Compare Source)
Patch release v5.8.1
This release mainly fixes the DeepSeek-V4 integration.
v5.8.0: Release 5.8.0 (Compare Source)
Release v5.8.0
New Model additions
DeepSeek-V4
DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model from DeepSeek that introduces several architectural innovations over DeepSeek-V3. The architecture replaces Multi-head Latent Attention (MLA) with a hybrid local + long-range attention design, swaps residual connections for Manifold-Constrained Hyper-Connections (mHC), and bootstraps the first few MoE layers with a static token-id → expert-id hash table. This implementation covers DeepSeek-V4-Flash, DeepSeek-V4-Pro, and their -Base pretrained variants, which share the same architecture but differ in width, depth, expert count and weights.
Links: Documentation | Paper
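The static token-id → expert-id hash table can be pictured as a plain lookup that needs no learned router. The sketch below is a toy illustration under stated assumptions, not DeepSeek's or transformers' code: how the real table is built is not described here, so a seeded random assignment stands in for it, and one expert per token is assumed. `build_hash_table` and `route` are hypothetical names.

```python
import random


def build_hash_table(vocab_size, num_experts, seed=0):
    """Build a fixed token-id -> expert-id table.

    Assumption: a seeded random assignment stands in for whatever
    static hashing scheme the real model uses.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_experts) for _ in range(vocab_size)]


def route(token_ids, table):
    """Look up the expert for each token id; no gate logits are computed."""
    return [table[token_id] for token_id in token_ids]
```

Because the table is fixed, the same token id always lands on the same expert regardless of context, which is what distinguishes this from a learned router.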
Gemma 4 Assistant
Gemma 4 Assistant is a small, text-only model that enables speculative decoding for Gemma 4 models using the Multi-Token Prediction (MTP) method and associated candidate generator. The model shares the same Gemma4TextModel backbone as other Gemma 4 models but uses KV sharing throughout the entire model, allowing it to reuse the KV cache populated by the target model and skip the pre-fill phase entirely. This architecture includes cross-attention to make the most of the target model's context, allowing the assistant to accurately predict more drafted tokens per drafting round.
Links: Documentation
GraniteSpeechPlus
Granite Speech Plus is a variant of Granite Speech that enhances the projector by consuming the concatenation of the encoder's final hidden states with an arbitrary subset of its intermediate hidden states along the feature dimension. It is a multimodal speech-to-text model that can transcribe audio, provide speaker annotation and word level timestamps by responding to text prompts. The model inherits the same architecture components as Granite Speech including the speech encoder, query transformer projector, language model, and optional LoRA adapter.
Links: Documentation
Granite4Vision
Granite Vision 4.1 is a vision-language model from IBM Research designed for enterprise-grade document data extraction. It specializes in chart extraction (Chart2CSV, Chart2Summary, Chart2Code), table extraction (JSON, HTML, OTSL), and semantic key-value pair extraction. The model builds on LLaVA-NeXT with architectural innovations including SigLIP2 Vision Encoder, Window Q-Former Projectors, and DeepStack Feature Injection with 8 vision-to-LLM injection points.
Links: Documentation
EXAONE-4.5
EXAONE 4.5 is the first open-weight vision language model developed by LG AI Research, integrating a dedicated visual encoder into the existing EXAONE 4.0 framework to expand multimodal capabilities. The model features 33 billion parameters in total, including 1.2 billion parameters from the vision encoder, and achieves competitive performance in general benchmarks while outperforming similar-sized models in document understanding and Korean contextual reasoning. It builds on EXAONE 4.0 with key enhancements including an expanded vocabulary of 153,600 tokens, support for up to 256K token context windows, and a Multi-Token Prediction (MTP) mechanism.
Links: Documentation | Paper | Blog Post
PP-FormulaNet
PP-FormulaNet-L and PP-FormulaNet_plus-L are lightweight models designed for table structure recognition, focusing on accurately recognizing table structures in documents and natural scenes. The models are part of the SLANet series and can be used for image-to-text tasks, specifically for detecting and processing mathematical formulas and table structures from images.
Links: Documentation
Breaking changes
Apex integration has been removed from the library (including RMSNorm usage in T5 and related models), so users relying on Apex for mixed precision or fused ops should migrate to PyTorch's native equivalents instead.
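To gauge whether the Apex removal affects the loader, one option is to scan its source for direct `apex` imports. The sketch below uses only the standard-library `ast` module. `find_apex_imports` is a hypothetical helper, and it only catches static imports, not dynamic ones such as `importlib.import_module("apex")`.

```python
import ast


def find_apex_imports(source):
    """Return sorted (line_number, module_name) pairs for apex imports
    found in a string of Python source code."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x".
            names = [node.module]
        else:
            continue
        for name in names:
            if name == "apex" or name.startswith("apex."):
                hits.append((node.lineno, name))
    return sorted(hits)
```

Any hits mark code that would need a PyTorch-native replacement, for example `torch.autocast` for mixed precision or `torch.nn.RMSNorm` in recent PyTorch versions. Which replacement fits depends on how Apex was being used.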
Tokenization
Fixed tokenizer mapping issues for DeepSeek R1 distilled (Qwen2) and DeepSeek OCR models, and resolved a significant performance regression in `PreTrainedTokenizer.convert_ids_to_tokens` where `skip_special_tokens=True` was rebuilding the special token set on every iteration, resulting in a ~300x speedup for that code path.
Bugfixes and improvements
- … `concurrency` to `PR CI` workflow file (pr-ci-caller.yml) (#45786) by @ydshieh in [#45786]
- … `text_config` in `AutoModelFor*.from_config` (#45770) by @jamesbraza in [#45770]
- [`OAI Privacy Filter`] Add integration test (#45725) by @vasqu in [#45725]
Significant community contributions
The following contributors have made significant changes to the library over the last release:
v5.7.0 (Compare Source)
Release v5.7.0
New Model additions
Laguna
Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing different decoder layers to have different query-head counts while sharing the same KV cache shape, and implements a sigmoid MoE router with auxiliary-loss-free load balancing that uses element-wise sigmoid of gate logits plus learned per-expert bias for router scoring.
Links: Documentation
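The sigmoid router with a per-expert bias can be sketched for a single token as follows. This is an assumption-laden illustration, not Laguna's implementation. In particular, it assumes the bias influences only which experts are selected while the mixing weights come from the unbiased sigmoid scores, as in other auxiliary-loss-free schemes. Laguna may combine them differently. `route_token` and its parameters are hypothetical names.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def route_token(gate_logits, expert_bias, top_k):
    """Pick top_k experts for one token.

    Returns (expert_indices, weights). Selection uses sigmoid(logit) + bias;
    weights are the unbiased sigmoid scores, normalized over the chosen
    experts (an assumption, see the lead-in).
    """
    if len(gate_logits) != len(expert_bias):
        raise ValueError("gate_logits and expert_bias must have equal length")
    affinities = [sigmoid(z) for z in gate_logits]
    biased = [a + b for a, b in zip(affinities, expert_bias)]
    order = sorted(range(len(biased)), key=lambda i: biased[i], reverse=True)
    chosen = order[:top_k]
    total = sum(affinities[i] for i in chosen)
    weights = [affinities[i] / total for i in chosen]
    return chosen, weights
```

Raising an expert's bias makes it more likely to be selected without changing its gate logit, which is how load can be rebalanced without adding an auxiliary loss term.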
DEIMv2
DEIMv2 (DETR with Improved Matching v2) is a real-time object detection model that extends DEIM with DINOv3 features and spans eight model sizes from X to Atto for diverse deployment scenarios. It uses a Spatial Tuning Adapter (STA) for larger variants to convert DINOv3's single-scale output into multi-scale features, while ultra-lightweight models employ pruned HGNetv2 backbones. The unified design achieves superior performance-cost trade-offs, with DEIMv2-X reaching 57.8 AP with only 50.3M parameters and DEIMv2-S being the first sub-10M model to exceed 50 AP on COCO.
Links: Documentation | Paper
Attention
Several attention-related bugs were fixed across multiple models, including a cross-attention cache type error in T5Gemma2 for long inputs, incorrect cached forward behavior in Qwen3.5's gated-delta-net linear attention, and a crash in GraniteMoeHybrid when no Mamba layers are present. Attention function dispatch was also updated to align with the latest model implementations.
Tokenizers
There was a bug in AutoTokenizer that caused the wrong tokenizer class to be initialized. This caused regressions in models like DeepSeek R1.
Generation
Continuous batching generation received several fixes and improvements, including correcting KV deduplication and memory estimation for long sequences (16K+), and removing misleading warnings about `num_return_sequences` and other unsupported features that were incorrectly firing even when functionality worked correctly. Documentation for per-request sampling parameters was also added.
Kernels
Improved kernel support by fixing configuration reading and error handling for FP8 checkpoints (e.g., Qwen3.5-35B-A3B-FP8), enabling custom expert kernels registered from the HF Hub to be properly loaded, and resolving an incompatibility that prevented Gemma3n and Gemma4 from using the rotary kernel.
Bugfixes and improvements
- … `x_clip`: 8 failed test cases (#45394) by @kaixuanliu in [#45394]
- … `NameError: PeftConfigLike` triggered by `PreTrainedModel.__init_subclass__` (#45658) by @qgallouedec in [#45658]
- … `clean_up_tokenization` for BPE tokenizers in `PreTrainedTokenizerFast` (#44915) by @maxsloef-goodfire in [#44915]
- … `supports_gradient_checkpointing` to `NemotronHPreTrainedModel` (#45625) by @sergiopaniego in [#45625]
- … `problem_type="single_label_classification"` with `num_labels=1` (#45611) by @gaurav0107 in [#45611]
- … `AttributeError` on `s_aux=None` in `flash_attention_forward` (#45589) by @jamesbraza in [#45589]
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.
This PR has been generated by Mend Renovate.