chore(deps): update loader dependencies major (major) by dreadnode-renovate-bot[bot] · Pull Request #194 · dreadnode/dyana

dreadnode-renovate-bot · 2026-02-24T20:12:14Z

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

| Package | Change |
| ------- | ------ |
| psutil | `==6.1.1` → `==7.2.2` |
| transformers | `==4.57.6` → `==5.9.0` |
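
As a reviewer aid, the pins above can be checked mechanically to confirm that both updates cross a major version boundary. This is a minimal sketch, not part of Renovate: the helper names `parse_pin` and `is_major_bump` are hypothetical, and it assumes exact `==X.Y.Z` pins with purely numeric components.

```python
def parse_pin(pin: str) -> tuple[int, ...]:
    """Turn an exact pin such as '==6.1.1' into a tuple of ints."""
    version = pin.strip().removeprefix("==")
    return tuple(int(part) for part in version.split("."))


def is_major_bump(old_pin: str, new_pin: str) -> bool:
    """Return True when the first version component increases."""
    return parse_pin(new_pin)[0] > parse_pin(old_pin)[0]


# Pins copied from the update table in this PR.
updates = {
    "psutil": ("==6.1.1", "==7.2.2"),
    "transformers": ("==4.57.6", "==5.9.0"),
}

# Names of packages whose major version changes.
major_bumps = sorted(
    name for name, (old, new) in updates.items() if is_major_bump(old, new)
)
```

Both packages land in `major_bumps`, which is why the breaking-change sections of the release notes below deserve a careful read before merging.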

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

Release Notes

giampaolo/psutil (psutil)

huggingface/transformers (transformers)

`v5.9.0`

Compare Source

Release v5.9.0

New Model additions

Cohere2Moe

Command A+ is a Mixture-of-Experts (MoE) language model from Cohere that features a hybrid attention pattern combining sliding window and full attention layers. The model incorporates both shared and routed experts and supports a very large context window for processing extensive text sequences.

Links: Documentation

Add new cohere2_moe model (#46115) by @Cyrilvallez in #46115

Parakeet tdt (#44171)

Parakeet tdt (#44171) by @lmaksym

HRM-Text

HRM-Text is an improved autoregressive language-modeling variant of the Hierarchical Reasoning Model (HRM) that uses a hierarchical recurrent forward pass with two transformer stacks - one for slow, abstract planning (H) and one for fast, detailed computation (L) - reused inside a nested recurrence. It features PrefixLM attention where instruction tokens attend bidirectionally while response tokens attend causally, per-head sigmoid output gates, and parameterless RMSNorm. The model is designed as a base language model without instruction tuning or chat templates.

Links: Documentation | Paper

Add hrm text (#46025) by @abcd1927 in #46025
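
To illustrate the PrefixLM attention pattern described above, here is a minimal sketch of the boolean mask it implies. It is not the HRM-Text implementation: the function name `prefix_lm_mask` is hypothetical, and it assumes the instruction tokens form a contiguous prefix of length `prefix_len`.

```python
def prefix_lm_mask(prefix_len: int, total_len: int) -> list[list[bool]]:
    """mask[q][k] is True when query position q may attend to key position k."""
    if not 0 <= prefix_len <= total_len:
        raise ValueError("prefix_len must lie between 0 and total_len")
    mask = []
    for q in range(total_len):
        row = []
        for k in range(total_len):
            # Prefix (instruction) tokens see each other in both directions.
            both_in_prefix = q < prefix_len and k < prefix_len
            # Every token sees itself and everything before it.
            causal = k <= q
            row.append(both_in_prefix or causal)
        mask.append(row)
    return mask


# Two instruction tokens followed by two response tokens.
mask = prefix_lm_mask(prefix_len=2, total_len=4)
```

In this example the first instruction token can attend to the second (bidirectional), while the first response token cannot attend to the one after it (causal).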

Breaking changes

The text_embeds input for SAM3, EdgeTAM, and SAM3-Lite-Text models now expects full text embeddings instead of just pooler outputs, aligning with other models in the library — users must update their inputs accordingly.

🚨Fix memory leaks caused by lru decorators in vision models (#45922) by @yonigozlan

Audio

Audio support was expanded with the addition of AudioFlamingoNext model checkpoints and improved compilability of audio/vision encoders via standalone pure functions. Additional improvements include better error messaging when loading audio from video files and new documentation for audio/video processors.

user friendly error when loading audio from video (#45221) by @eustlb in [#45221]
[docs] adding audio/video processors (#45795) by @stevhliu in [#45795]
Support Audio Flamingo Next checkpoints (#44830) by @lashahub in [#44830]
Extract dynamic vision/audio tensors into standalone pure functions (#45396) by @IlyasMoutawwakil in [#45396]

Generation

Fixed generation issues including inputs_embeds and per_layer_inputs handling for Gemma4, an AttributeError in RAG's generate() caused by missing config fields, and flaky VLM generation tests by blocking special image tokens during sampling.

Fix Gemma4 generation from inputs_embeds and per_layer_inputs (#46049) by @Cyrilvallez in [#46049]
Fix AttributeError in RAG generate() for missing config fields (#46035) by @Sriniketh24 in [#46035]
Block image_start/end_token_id in generation test sampling (#45914) by @Rocketknight1 in [#45914]
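
The idea behind blocking special image tokens during sampling can be sketched in plain Python: set the logits of the blocked ids to negative infinity so their probability is exactly zero. This is an illustration only, not the test code from #45914; the function name and the token ids below are hypothetical.

```python
import math
import random


def sample_with_blocked_ids(logits, blocked_ids, rng):
    """Sample a token id from softmax(logits), never returning a blocked id."""
    masked = [(-math.inf if i in blocked_ids else x) for i, x in enumerate(logits)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("every token id is blocked")
    # Subtract the peak for numerical stability; exp(-inf) evaluates to 0.0.
    weights = [math.exp(x - peak) for x in masked]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Hypothetical ids for the image start/end tokens.
IMAGE_START_ID, IMAGE_END_ID = 3, 4
rng = random.Random(0)
# The blocked ids have by far the largest logits, yet must never be drawn.
logits = [0.1, 0.5, 0.2, 9.0, 9.0]
samples = [
    sample_with_blocked_ids(logits, {IMAGE_START_ID, IMAGE_END_ID}, rng)
    for _ in range(200)
]
```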

Bugfixes and improvements

Remove mask visualization tool from masking_utils.py (#46066) by @Cyrilvallez in [#46066]
fix: owned_by field in GET /v1/models returns list instead of string (#46006) by @nileshpatil6 in [#46006]
[CB] Remove OpenTelemetry (#45984) by @remi-or in [#45984]
docs(readme): use canonical huggingface.co domain in prose links (#46042) by @kiwigitops in [#46042]
Fix remaining RAG doc examples that crash on current transformers (#46044) by @Sriniketh24 in [#46044]
Init the actual tensor, not a copy (#46030) by @Rocketknight1 in [#46030]
docs: sync legacy ACL anthology URLs and update metrics across i18n READMEs (#46027) by @irfaan101 in [#46027]
[MultimodalLM] add language_model to the get/set_input_embeddings logic (#46029) by @eustlb in [#46029]
[HRM Text] Add integration tests (#46033) by @vasqu in [#46033]
hy_v3: add XPU expectations (#45858) by @kaixuanliu in [#45858]
exaone4_5: add XPU expectations (#45890) by @kaixuanliu in [#45890]
hyperclovax: add XPU Expectations for CI test (#45926) by @kaixuanliu in [#45926]
chore(ci): remove dead env vars from circleci-failure-summary-comment.yml (#45972) by @XciD in [#45972]
[CB] [Major] Add tensor paralellism (#45821) by @remi-or in [#45821]
docs: update models architecture count and sync ACL anthology URLs (#46001) by @irfaan101 in [#46001]
bugfix(ci): avoid E2BIG in pr_slow_ci_suggestion (#45983) by @tarekziade in [#45983]
RFDetr - use correct Roboflow org for release (#45946) by @sbucaille in [#45946]
docs: Fix formatting issues in weightconverter.md (#45988) by @ArjunSrivastava1 in [#45988]
Fix colqwen2 test (#45981) by @IlyasMoutawwakil in [#45981]
Fix M-RoPE device mismatch in Qwen3VL family under FSDP2 CPU offload (#45861) by @jamesbraza in [#45861]
[docs] chat template prefill (#45947) by @stevhliu in [#45947]
[docs] decode fast path (#45899) by @stevhliu in [#45899]
fix: restore _attn_implementation and fix request offset in generate_batch() (#45943) by @sergiopaniego in [#45943]
Expose per_layer_inputs for every Gemma4 variants (#45927) by @Cyrilvallez in [#45927]
chore: update benchmark_v2.yml (#45966) by @hf-security-analysis[bot] in [#45966]
fix(ci): set persist-credentials: false on actions/checkout and close remaining template injection findings (#45964) by @XciD in [#45964]
chore(ci): set default workflow permissions to contents: read (#45961) by @XciD in [#45961]
fix(ci): remove template injection on pull_request_target workflows (#45956) by @XciD in [#45956]
chore(ci): pin all GitHub Actions and reusable workflows by SHA (#45955) by @XciD in [#45955]
[docs] ALMModelTest (#45900) by @stevhliu in [#45900]
Enhance apply_chat_template to support custom field prefilling (reasoning_content, thinking, etc.) (#45896) by @Mamiglia in [#45896]
BUGFIX: Support hubert models that don't have conv_pos_batch_norm configured (#45921) by @igordertigor in [#45921]
Revert 45777 (#45942) by @Rocketknight1 in [#45942]
pass the otel secrets (#45933) by @tarekziade in [#45933]
Add initial torch_tpu backend support (#45918) by @tengomucho in [#45918]
[CB] Hide activation footprint by using the CUDA graph pool (#45911) by @remi-or in [#45911]
Require input_ids for repetition penalty (#45389) by @ruben-aghayan in [#45389]
Fix undefined 'input' variable (#45895) by @fullyz in [#45895]
Fix post processing RF-DETR (#46041) by @yonigozlan (direct commit on v5.9.0)
[loading] Free up tensors faster inside ConversionOps (#46110) by @Cyrilvallez (direct commit on v5.9.0)
Add new cohere2_moe model (#46115) by @Cyrilvallez (direct commit on v5.9.0)
Fix cohere2 tp_plan for release by @Cyrilvallez (direct commit on v5.9.0)
Release v5.9.0 by @Cyrilvallez (direct commit on v5.9.0)

Significant community contributions

The following contributors have made significant changes to the library over the last release:

@lmaksym
- Parakeet tdt (#44171)
@eustlb
- user friendly error when loading audio from video (#45221)
- [MultimodalLM] add language_model to the get/set_input_embeddings logic (#46029)
@remi-or
- [CB] Remove OpenTelemetry (#45984)
- [CB] [Major] Add tensor paralellism (#45821)
- [CB] Hide activation footprint by using the CUDA graph pool (#45911)
@abcd1927
- Add hrm text (#46025)

`v5.8.1`: Patch release v5.8.1

Compare Source

Patch release v5.8.1

This release is mainly to fix the Deepseek V4 integration!!!

[fix] Add fatal_error to ContinuousBatchingManager so the serving... by @qgallouedec, @remi-or
Fix WeightConverter regex incorrectly matching shared_experts as experts by @silencelamb, @claude
Fix deepseek v4 by @ArthurZucker (#45892)
Deepseek v4 csa mask collapse by @ArthurZucker, @Sawyer117 (#45928)
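
The `shared_experts` versus `experts` mix-up mentioned above is a common substring-matching pitfall. The sketch below reproduces that class of bug with made-up key names and patterns; it does not show the actual `WeightConverter` patterns or the actual fix, which are not included in these notes.

```python
import re

# Hypothetical checkpoint keys, for illustration only.
keys = [
    "model.layers.0.mlp.experts.3.gate_proj.weight",
    "model.layers.0.mlp.shared_experts.gate_proj.weight",
]

# Loose pattern: also fires inside "shared_experts.".
loose = re.compile(r"experts\.")

# Stricter pattern: "experts" must not be preceded by a word character,
# so the "_" in "shared_experts" prevents a match.
strict = re.compile(r"(?<![A-Za-z0-9_])experts\.")

loose_hits = [k for k in keys if loose.search(k)]
strict_hits = [k for k in keys if strict.search(k)]
```

Here `loose_hits` contains both keys, while `strict_hits` contains only the routed-experts key.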

`v5.8.0`: Release 5.8.0

Compare Source

Release v5.8.0

New Model additions

DeepSeek-V4

DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model from DeepSeek that introduces several architectural innovations over DeepSeek-V3. The architecture replaces Multi-head Latent Attention (MLA) with a hybrid local + long-range attention design, swaps residual connections for Manifold-Constrained Hyper-Connections (mHC), and bootstraps the first few MoE layers with a static token-id → expert-id hash table. This implementation covers DeepSeek-V4-Flash, DeepSeek-V4-Pro, and their -Base pretrained variants, which share the same architecture but differ in width, depth, expert count and weights.

Links: Documentation | Paper

Add DeepSeek V4 (#45643) by @ArthurZucker in #45643

Gemma 4 Assistant

Gemma 4 Assistant is a small, text-only model that enables speculative decoding for Gemma 4 models using the Multi-Token Prediction (MTP) method and associated candidate generator. The model shares the same Gemma4TextModel backbone as other Gemma 4 models but uses KV sharing throughout the entire model, allowing it to reuse the KV cache populated by the target model and skip the pre-fill phase entirely. This architecture includes cross-attention to make the most of the target model's context, allowing the assistant to accurately predict more drafted tokens per drafting round.

Links: Documentation

First model (#45788) by @SindhuRaghuram97 in #45788

GraniteSpeechPlus

Granite Speech Plus is a variant of Granite Speech that enhances the projector by consuming the concatenation of the encoder's final hidden states with an arbitrary subset of its intermediate hidden states along the feature dimension. It is a multimodal speech-to-text model that can transcribe audio, provide speaker annotation and word level timestamps by responding to text prompts. The model inherits the same architecture components as Granite Speech including the speech encoder, query transformer projector, language model, and optional LoRA adapter.

Links: Documentation

Support for a new Granite-Speech-Plus model (#45695) by @zvik in #45695

Granite4Vision

Granite Vision 4.1 is a vision-language model from IBM Research designed for enterprise-grade document data extraction. It specializes in chart extraction (Chart2CSV, Chart2Summary, Chart2Code), table extraction (JSON, HTML, OTSL), and semantic key-value pair extraction. The model builds on LLaVA-NeXT with architectural innovations including SigLIP2 Vision Encoder, Window Q-Former Projectors, and DeepStack Feature Injection with 8 vision-to-LLM injection points.

Links: Documentation

Add Granite 4.1 Vision (granite4_vision) (#45597) by @artem-spector in #45597

EXAONE-4.5

EXAONE 4.5 is the first open-weight vision language model developed by LG AI Research, integrating a dedicated visual encoder into the existing EXAONE 4.0 framework to expand multimodal capabilities. The model features 33 billion parameters in total, including 1.2 billion parameters from the vision encoder, and achieves competitive performance in general benchmarks while outperforming similar-sized models in document understanding and Korean contextual reasoning. It builds on EXAONE 4.0 with key enhancements including an expanded vocabulary of 153,600 tokens, support for up to 256K token context windows, and a Multi-Token Prediction (MTP) mechanism.

Links: Documentation | Paper | Blog Post

Add EXAONE 4.5 implementations (#45471) by @nuxlear in #45471

PP-FormulaNet

PP-FormulaNet-L and PP-FormulaNet_plus-L are lightweight models designed for table structure recognition, focusing on accurately recognizing table structures in documents and natural scenes. The models are part of the SLANet series and can be used for image-to-text tasks, specifically for detecting and processing mathematical formulas and table structures from images.

Links: Documentation

[Model] Add PP-FormulaNet Model Support (#45626) by @zhang-prog in #45626

Breaking changes

Apex integration has been removed from the library (including RMSNorm usage in T5 and related models), so users relying on Apex for mixed precision or fused ops should migrate to PyTorch's native equivalents instead.

🚨 Get rid of most Apex references (#45723) by @Rocketknight1

Tokenization

Fixed tokenizer mapping issues for DeepSeek R1 distilled (Qwen2) and DeepSeek OCR models, and resolved a significant performance regression in PreTrainedTokenizer.convert_ids_to_tokens where skip_special_tokens=True was rebuilding the special token set on every iteration, resulting in a ~300x speedup for that code path.

deepseek r1 distilled tokenizer fix for qwen2 mapping (#45741) by @itazap in [#45741]
DeepSeek OCR specifies an incorrect tokenizer class on the Hub (#45739) by @hmellor in [#45739]
PythonBackend slow tokenizer convert_ids_to_tokens fix (#45728) by @i3hz in [#45728]
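
The regression described above, rebuilding the special-token set on every loop iteration, follows a general pattern that can be sketched as below. The function names, vocabulary, and token strings are hypothetical and the code is not taken from `PreTrainedTokenizer`; it only contrasts the two loop shapes.

```python
def convert_slow(ids, vocab, special_tokens):
    """Rebuilds the special-token set for every id (the costly shape)."""
    tokens = []
    for i in ids:
        if vocab[i] in set(special_tokens):  # set rebuilt on each iteration
            continue
        tokens.append(vocab[i])
    return tokens


def convert_fast(ids, vocab, special_tokens):
    """Builds the set once, then does constant-time membership checks."""
    special = set(special_tokens)
    return [vocab[i] for i in ids if vocab[i] not in special]


# Hypothetical toy vocabulary.
vocab = {0: "<s>", 1: "hello", 2: "world", 3: "</s>"}
special_tokens = ["<s>", "</s>", "<pad>"]
ids = [0, 1, 2, 3]

slow_result = convert_slow(ids, vocab, special_tokens)
fast_result = convert_fast(ids, vocab, special_tokens)
```

Both functions return the same tokens; the second simply avoids work proportional to the number of special tokens on each step.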

Bugfixes and improvements

fix: correct spelling in continuous_api docstring (#45749) by @Dhruv908615 in [#45749]
Fix link to modular transformers documentation (#45746) by @SangbumChoi in [#45746]
Gemma4: fix failed test cases (#45568) by @kaixuanliu in [#45568]
Fix CI: Allow more artifacts to be download in CI (#45785) by @ydshieh in [#45785]
Add concurrency to PR CI workflow file (pr-ci-caller.yml) (#45786) by @ydshieh in [#45786]
Reorder decorators for autodoc and dataclass (#45702) by @zucchini-nlp in [#45702]
Unwrap text_config in AutoModelFor*.from_config (#45770) by @jamesbraza in [#45770]
fix: Added Mps support in float fallback backends list (#45687) by @rigen1048 in [#45687]
Github Actions PR CI (caller) (#45476) by @ydshieh in [#45476]
make sure we call check_auto in CI (#45775) by @tarekziade in [#45775]
Fix auto mapping script (#45774) by @Cyrilvallez in [#45774]
[MINISTRAL3] Fix conversion script yarn's apply_scale support. (#45744) by @juliendenize in [#45744]
[nemotron_h] respect _no_reinit flag on dt_bias and out_proj.weight (#45591) by @vai-minzhou in [#45591]
fix(utils): Resolve backbone utils test regressions (#45594) by @harshaljanjani in [#45594]
[CB] Better overall script and decode bucketting (#45653) by @remi-or in [#45653]
[docs] model testing (#45152) by @stevhliu in [#45152]
update dev (#45726) by @vasqu in [#45726]
Doc translate to Persian(farsi) (#45664) by @zeoses in [#45664]
[OAI Privacy Filter] Add integration test (#45725) by @vasqu in [#45725]
Speedup Qwen2VLImageProcessor (#45719) by @lgeiger in [#45719]
Remove dead beam-search dummies from dummy_pt_objects.py (#45722) by @jw9603 in [#45722]
chore(typing): add ty type checking for 10 utility files (#45703) by @moonbogi in [#45703]
Llama3 video fix (#45040) by @sywangyi in [#45040]
Fix custom-module copies inheriting read-only permissions (#45686) by @nurpax in [#45686]
Python code in model docs (#45608) by @zucchini-nlp in [#45608]
fix failed test cases for blt model (#45596) by @kaixuanliu in [#45596]
chore(typing): add ty type checking for 3 pipeline files (#45667) by @moonbogi in [#45667]

Significant community contributions

The following contributors have made significant changes to the library over the last release:

@artem-spector
- Add Granite 4.1 Vision (granite4_vision) (#45597)
@SindhuRaghuram97
- First model (#45788)
@nuxlear
- Add EXAONE 4.5 implementations (#45471)
@ArthurZucker
- Add DeepSeek V4 (#45643)
@remi-or
- [CB] Better overall script and decode bucketting (#45653)
@zhang-prog
- [Model] Add PP-FormulaNet Model Support (#45626)
@zvik
- Support for a new Granite-Speech-Plus model (#45695)

`v5.7.0`

Compare Source

Release v5.7.0

New Model additions

Laguna

Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing different decoder layers to have different query-head counts while sharing the same KV cache shape, and implements a sigmoid MoE router with auxiliary-loss-free load balancing that uses element-wise sigmoid of gate logits plus learned per-expert bias for router scoring.

Links: Documentation

Laguna XS.2 implementation (#45673) by @joerowell in #45673
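
A sigmoid router with a per-expert bias, as described above, might look roughly like the following sketch. It is not Laguna's code. In particular, it assumes the bias influences only which experts are selected, and that the mixing weights are the unbiased sigmoid scores renormalized over the selected experts; the release notes do not specify either detail. All names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def route(gate_logits, expert_bias, top_k):
    """Return {expert_index: mixing_weight} for the top_k selected experts."""
    # Element-wise sigmoid, so each expert is scored independently.
    scores = [sigmoid(z) for z in gate_logits]
    # Assumption: the learned bias shifts selection only.
    biased = [s + b for s, b in zip(scores, expert_bias)]
    chosen = sorted(range(len(scores)), key=lambda e: biased[e], reverse=True)[:top_k]
    # Assumption: weights come from the unbiased scores, renormalized.
    total = sum(scores[e] for e in chosen)
    return {e: scores[e] / total for e in chosen}


gate_logits = [2.0, 0.0, -1.0, 1.0]
no_bias = route(gate_logits, [0.0, 0.0, 0.0, 0.0], top_k=2)
with_bias = route(gate_logits, [0.0, 0.0, 1.0, 0.0], top_k=2)
```

With zero bias the two highest-scoring experts (0 and 3) are picked; raising the bias of expert 2 pulls it into the selection without an auxiliary loss term, which is the load-balancing lever the description refers to.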

DEIMv2

DEIMv2 (DETR with Improved Matching v2) is a real-time object detection model that extends DEIM with DINOv3 features and spans eight model sizes from X to Atto for diverse deployment scenarios. It uses a Spatial Tuning Adapter (STA) for larger variants to convert DINOv3's single-scale output into multi-scale features, while ultra-lightweight models employ pruned HGNetv2 backbones. The unified design achieves superior performance-cost trade-offs, with DEIMv2-X reaching 57.8 AP with only 50.3M parameters and DEIMv2-S being the first sub-10M model to exceed 50 AP on COCO.

Links: Documentation | Paper

model: Add DEIMv2 to Transformers (#44339) by @harshaljanjani in #44339

Attention

Several attention-related bugs were fixed across multiple models, including a cross-attention cache type error in T5Gemma2 for long inputs, incorrect cached forward behavior in Qwen3.5's gated-delta-net linear attention, and a crash in GraniteMoeHybrid when no Mamba layers are present. Attention function dispatch was also updated to align with the latest model implementations.

Fix cross-attention cache layer type for T5Gemma2 long inputs (#45540) by @Beichen-Ma in [#45540]
[Qwen3.5] Fix GDN linear attention multi-token cached forward (#45513) by @kashif in [#45513]
Fix GraniteMoeHybrid _update_mamba_mask crash on attention-only models (#45514) by @tianhaocui in [#45514]
Align latest model attention function dispatch (#45598) by @Cyrilvallez in [#45598]

Tokenizers

There was a bug in AutoTokenizer that caused the wrong tokenizer class to be initialized. This caused regressions in models like DeepSeek R1.

change got reverted (#45680) by @itazap in [#45680]

Generation

Continuous batching generation received several fixes and improvements, including correcting KV deduplication and memory estimation for long sequences (16K+), and removing misleading warnings about num_return_sequences and other unsupported features that were incorrectly firing even when functionality worked correctly. Documentation for per-request sampling parameters was also added.

generate: drop stale num_return_sequences warning on continuous batching path (#45582) by @joaquinhuigomez in [#45582]
Remove unnecessary generate warnings (#45619) by @Cyrilvallez in [#45619]
[CB] Changes for long generation (#45530) by @remi-or in [#45530]
[docs] per-request sampling params (#45553) by @stevhliu in [#45553]

Kernels

Improved kernel support by fixing configuration reading and error handling for FP8 checkpoints (e.g., Qwen3.5-35B-A3B-FP8), enabling custom expert kernels registered from the HF Hub to be properly loaded, and resolving an incompatibility that prevented Gemma3n and Gemma4 from using the rotary kernel.

Fix configuration reading and error handling for kernels (#45610) by @hmellor in [#45610]
Allow for registered experts from kernels hub (#45577) by @winglian in [#45577]
Gemma3n and Gemma4 cannot use rotary kernel (#45564) by @Cyrilvallez in [#45564]

Bugfixes and improvements

fixing more typos (#45689) by @vasqu in [#45689]
[docs] cb memory management (#45587) by @stevhliu in [#45587]
[docs] cpu offloading (#45660) by @stevhliu in [#45660]
docs(README_zh-hans): clarify conditions for not using Transformers (#45688) by @GuaiZai233 in [#45688]
fix padding side issue for fast_vlm tests (#45592) by @kaixuanliu in [#45592]
Fix x_clip: 8 failed test cases (#45394) by @kaixuanliu in [#45394]
zero_shot_object_detection ValueError fix for python 3.13 (#45669) by @AnkitAhlawat7742 in [#45669]
Fix pageable H2D copies in Gated DeltaNet PyTorch fallback (#45665) by @ruixiang63 in [#45665]
Fix UnboundLocalError in shard_and_distribute_module for replicated parameters (#45675) by @Abdennacer-Badaoui in [#45675]
[MistralCommonBackend] Soften validation mode and apply_chat_template arguments check (#45628) by @juliendenize in [#45628]
Fix NameError: PeftConfigLike triggered by PreTrainedModel.__init_subclass__ (#45658) by @qgallouedec in [#45658]
chore(typing): added modeling_utils to ty (#45425) by @tarekziade in [#45425]
[gemma4] infer from config instead of hardcoding (#45606) by @eustlb in [#45606]
Update quants tests (#45480) by @SunMarc in [#45480]
🔴🔴🔴 fix: skip clean_up_tokenization for BPE tokenizers in PreTrainedTokenizerFast (#44915) by @maxsloef-goodfire in [#44915]
Fix colmodernvbert tests (#45652) by @Cyrilvallez in [#45652]
[CB] [Major] Add CPU request offloading (#45184) by @remi-or in [#45184]
Fix peft constructors (#45622) by @Cyrilvallez in [#45622]
chore: speedup modular converter (~30%) (#45046) by @tarekziade in [#45046]
Fix whisper return language (#42227) by @FredHaa in [#42227]
Add supports_gradient_checkpointing to NemotronHPreTrainedModel (#45625) by @sergiopaniego in [#45625]
Raise clear error for problem_type="single_label_classification" with num_labels=1 (#45611) by @gaurav0107 in [#45611]
CircleCI with torch 2.11 (#45633) by @ydshieh in [#45633]
chore: bump doc-builder SHA for main doc build workflow (#45631) by @rtrompier in [#45631]
Allow more artifacts to be download in CI (#45629) by @ydshieh in [#45629]
chore(qa): split pipeline and add type checking (#45432) by @tarekziade in [#45432]
Skip failing offloading tests (#45624) by @Cyrilvallez in [#45624]
fix: compute auxiliary losses when denoising is disabled in D-FINE (#45601) by @Abineshabee in [#45601]
qa: bumped mlinter and allow local override (#45585) by @tarekziade in [#45585]
Processing Utils: continue when content is a string (#45605) by @RyanMullins in [#45605]
SonicMoe (#45433) by @IlyasMoutawwakil in [#45433]
fix transformers + torchao nvfp4 serialization (#45573) by @vkuzo in [#45573]
[AMD CI] Fix expectations for Gemma3n (#45602) by @Abdennacer-Badaoui in [#45602]
[docs] multi-turn tool calling (#45554) by @stevhliu in [#45554]
Fix AttributeError on s_aux=None in flash_attention_forward (#45589) by @jamesbraza in [#45589]
do not index past decoded chars with special tokens ([#45435](https://redirect.github.com/huggingface/transformers/issues/454

✂ Note

PR body was truncated to here.

Configuration

📅 Schedule: (UTC)

Branch creation
- At any time (no schedule defined)
Automerge
- At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.

- [ ] If you want to rebase/retry this PR, check this box

This PR has been generated by Mend Renovate.

| datasource | package | from | to |
| ---------- | ------------ | ------ | ----- |
| pypi | psutil | 6.1.1 | 7.2.2 |
| pypi | transformers | 4.57.6 | 5.9.0 |

dreadnode-renovate-bot[bot] added the type/digest (Dependency digest updates) label on Feb 24, 2026

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch 3 times, most recently from 07525d6 to 3ac3e72, on March 1, 2026 00:53

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch from 3ac3e72 to 4daa5d1 on March 8, 2026 00:48

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch 2 times, most recently from 3e0d62f to 4b95150, on April 1, 2026 00:57

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch from 4b95150 to 40a28f1 on April 8, 2026 00:52

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch 2 times, most recently from 85f7052 to c4f4579, on April 19, 2026 00:59

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch from c4f4579 to 37b26b9 on April 26, 2026 01:01

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch from 37b26b9 to ca4e25e on May 3, 2026 01:07

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch from ca4e25e to b5496fe on May 10, 2026 01:09

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch from b5496fe to a845574 on May 17, 2026 01:11

chore(deps): update loader dependencies major

f7682ea

| datasource | package | from | to |
| ---------- | ------------ | ------ | ----- |
| pypi | psutil | 6.1.1 | 7.2.2 |
| pypi | transformers | 4.57.6 | 5.9.0 |

dreadnode-renovate-bot[bot] force-pushed the renovate/major-loader-deps-major branch from a845574 to f7682ea on May 24, 2026 01:12

dreadnode-renovate-bot[bot] wants to merge 1 commit into main from renovate/major-loader-deps-major
