Honor the model's generation_config.eos_token_id in the transformers backend by waqaskhan137 · Pull Request #1279 · huggingface/lighteval

waqaskhan137 · 2026-07-02T02:49:51Z

What

For generative tasks, the generation_config.update(...) call in _generate_padded sets eos_token_id=self.tokenizer.eos_token_id, overriding the terminators the model itself declares in generation_config.eos_token_id. Chat models whose turn terminator differs from the tokenizer's eos therefore never stop on their own: they emit their turn-end token, generation ignores it, and every generation runs to max_new_tokens.

Concrete case in the issue: Gemma ends chat turns with token 106 while tokenizer.eos_token is <eos> (id 1); the model declares generation_config.eos_token_id = [1, 106, 50]. With the override, an MMLU chain-of-thought task padded every generation to the cap, with 63-95% of returned tokens being token 106 repeated.
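The failure mode can be reproduced without a model. The following is a toy sketch, not lighteval or transformers code: toy_generate and gemma_like_stream are hypothetical names, and the stream imitates a chat model that emits a few content tokens and then repeats its turn-end token 106 indefinitely.

```python
from itertools import chain, repeat


def toy_generate(next_tokens, stop_ids, max_new_tokens):
    """Hypothetical decode loop: stop on a stop id or at the token cap."""
    out = []
    for tok in next_tokens:
        out.append(tok)
        if tok in stop_ids or len(out) >= max_new_tokens:
            break
    return out


def gemma_like_stream():
    # Three made-up content tokens, then the turn-end token forever.
    return chain([11, 12, 13], repeat(106))


# Override behaviour: only the tokenizer's eos (id 1) is a stop id.
overridden = toy_generate(gemma_like_stream(), stop_ids={1}, max_new_tokens=8)

# Honoring the model's declared terminators [1, 106, 50].
honored = toy_generate(gemma_like_stream(), stop_ids={1, 106, 50}, max_new_tokens=8)

print(overridden)
print(honored)
```

With only id 1 as a stop id the loop runs to the cap and the tail is all token 106; with the declared terminators it stops at the first 106.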

Fix

Prefer self.model.generation_config.eos_token_id when set, falling back to self.tokenizer.eos_token_id (one line, plus a comment). Models whose declared eos already equals the tokenizer's are unaffected.
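The fallback described above can be sketched as follows. This is an illustration under assumptions, not the PR's actual diff: resolve_eos_token_id is a hypothetical helper, and SimpleNamespace objects stand in for a model and tokenizer that expose generation_config.eos_token_id and eos_token_id.

```python
from types import SimpleNamespace


def resolve_eos_token_id(model, tokenizer):
    """Hypothetical helper: prefer the model's declared terminators."""
    declared = getattr(model.generation_config, "eos_token_id", None)
    # "is not None" rather than "or", so a declared id of 0 is still honored.
    return declared if declared is not None else tokenizer.eos_token_id


# Stand-ins for illustration only; real objects come from transformers.
tokenizer = SimpleNamespace(eos_token_id=1)
gemma_like = SimpleNamespace(
    generation_config=SimpleNamespace(eos_token_id=[1, 106, 50])
)
undeclared = SimpleNamespace(generation_config=SimpleNamespace(eos_token_id=None))

print(resolve_eos_token_id(gemma_like, tokenizer))
print(resolve_eos_token_id(undeclared, tokenizer))
```

A model that declares nothing falls back to the tokenizer's eos, so behaviour for such models is the same as before the change.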

Validation

Measured on google/gemma-4-E2B-it, MMLU CoT item, RTX PRO 6000, bf16 (details in #1278):

metric	before	after
tokens_generated	7168 (= cap)	2654 (stops on its own)
token-106 padding	6774	1
latency / item	145 s	55 s
extracted answer	correct	correct (unchanged)
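The ratios implied by the table can be derived directly from its numbers. This is a small arithmetic check using only the values reported above; the variable names are illustrative.

```python
# Values copied from the before/after table for the single MMLU CoT item.
before = {"tokens": 7168, "padding": 6774, "latency_s": 145}
after = {"tokens": 2654, "padding": 1, "latency_s": 55}

# Share of the "before" output that was repeated token 106.
padding_share = before["padding"] / before["tokens"]

# Fraction of generated tokens saved by stopping on the declared terminator.
token_reduction = 1 - after["tokens"] / before["tokens"]

# Per-item latency improvement.
speedup = before["latency_s"] / after["latency_s"]

print(f"padding share before: {padding_share:.1%}")
print(f"token reduction:      {token_reduction:.1%}")
print(f"latency speedup:      {speedup:.2f}x")
```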

ruff check and ruff format --check pass on the touched file.

…backend

For generative tasks the transformers backend overrode eos_token_id with tokenizer.eos_token_id, discarding the terminators the model itself declares. Chat models whose turn terminator is not the tokenizer eos (e.g. Gemma ends turns with token 106 while tokenizer.eos is 1, generation_config declares [1, 106, 50]) therefore never stopped and padded every generation to max_new_tokens, wasting up to ~95% of generated tokens and corrupting generation-length measurements.

Prefer model.generation_config.eos_token_id when set, falling back to the tokenizer's.

Measured on Gemma MMLU CoT: 7168 -> 2654 tokens, 145s -> 55s per item, extracted answer unchanged.

Fixes huggingface#1278

waqaskhan137 wants to merge 1 commit into huggingface:main from waqaskhan137:fix-eos-token-id-chat-models
