Validate unique column names in pandas. #12199

Merged
trivialfis merged 3 commits into dmlc:master from trivialfis:check-df-names
May 10, 2026

Conversation

@trivialfis
Member

No description provided.

Contributor

Copilot AI left a comment

Pull request overview

Adds an explicit validation step in the pandas DataFrame transformation path to reject duplicate column names early, preventing ambiguous feature handling when constructing XGBoost matrices from pandas inputs.

Changes:

  • Reject pandas DataFrames with duplicate data.columns in _transform_pandas_df.
  • Raise a ValueError when duplicates are detected.
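
The check described above can be sketched in isolation. This is a minimal illustration, not the merged code: `check_unique_columns` is a hypothetical helper name, and the error message is copied from the diff snippet quoted in this thread, which is marked outdated and may differ from what was merged. Only `Index.has_duplicates` is taken from the PR itself.

```python
import pandas as pd


def check_unique_columns(data: pd.DataFrame) -> None:
    """Hypothetical stand-in for the check added to _transform_pandas_df."""
    # Index.has_duplicates is True when any column label appears more than once.
    if data.columns.has_duplicates:
        raise ValueError("Duplicate column names are not supported.")


# Two columns share the label "a", so selecting "a" is ambiguous:
# df["a"] returns a two-column DataFrame rather than a single Series.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "a", "b"])

try:
    check_unique_columns(df)
    rejected = False
except ValueError:
    rejected = True

# A frame with distinct labels passes silently.
check_unique_columns(pd.DataFrame({"a": [1], "b": [2]}))
```

Failing early here means the caller sees a clear message about column names instead of a less obvious failure later in matrix construction.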

Comment thread python-package/xgboost/data.py Outdated
Comment on lines +660 to +661
if data.columns.has_duplicates:
    raise ValueError("Duplicate column names are not supported.")
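
For callers who hit this error, one way to find and resolve the offending labels before passing the frame on is sketched below. It uses only standard pandas calls (`Index.duplicated`, `DataFrame.loc`); the sample frame and the renaming scheme are assumptions made for illustration and are not part of this PR.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "a", "b"])

# Index.duplicated() marks every repeat after the first occurrence.
dup_mask = df.columns.duplicated()
duplicate_names = df.columns[dup_mask].unique().tolist()

# Option 1: keep only the first column for each label.
deduped = df.loc[:, ~dup_mask]

# Option 2: keep all columns but give each one a unique label
# by appending its position (an arbitrary scheme chosen for this sketch).
renamed = df.copy()
renamed.columns = [f"{name}_{i}" for i, name in enumerate(df.columns)]
```

Which option is appropriate depends on whether the repeated columns carry the same data or distinct features that were merely given the same name.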
Member Author

@copilot apply changes based on this feedback

trivialfis and others added 2 commits May 9, 2026 05:19
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@trivialfis trivialfis merged commit 61cb83d into dmlc:master May 10, 2026
81 of 82 checks passed
@trivialfis trivialfis deleted the check-df-names branch May 10, 2026 04:01
