Skip to content

Improvements for PR #100 - ignore_metadata for assert_approx_df_equality#184

Open
alexott wants to merge 2 commits into
MrPowers:mainfrom
alexott:pr-100-improvements
Open

Improvements for PR #100 - ignore_metadata for assert_approx_df_equality#184
alexott wants to merge 2 commits into
MrPowers:mainfrom
alexott:pr-100-improvements

Conversation

@alexott
Copy link
Copy Markdown
Contributor

@alexott alexott commented Apr 12, 2026

PR Checklist

  • A description of the changes is added to the description of this PR.
  • If there is a related issue, make sure it is linked to this PR.
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added or modified a feature, documentation in docs is updated

Description of changes

The #100 started to add support for ignore_metadata in assert_approx_df_equality, but most of the work was already merged in #182. This PR fixes the missing piece in the assert_approx_df_equality implementation

…df_equality

The MrPowers#100 started to add support for `ignore_metadata` in `assert_approx_df_equality`, but
most of the work was already merged in MrPowers#182.  This PR fixes the missing piece in the
`assert_approx_df_equality` implementation
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes support for the ignore_metadata option in assert_approx_df_equality, aligning approximate DataFrame comparisons with existing schema-comparison capabilities (notably the schema comparer updates introduced in earlier work).

Changes:

  • Add an ignore_metadata flag to assert_approx_df_equality.
  • Pass ignore_metadata through to assert_schema_equality during approx comparisons.
  • Add a unit test verifying metadata differences can be ignored for approx DataFrame equality.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
chispa/dataframe_comparer.py Wires ignore_metadata into assert_approx_df_equality by threading it into schema equality checks.
tests/test_dataframe_comparer.py Adds a regression test ensuring approx equality can ignore schema metadata differences.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 118 to 130
def assert_approx_df_equality(
df1: DataFrame,
df2: DataFrame,
precision: float,
ignore_nullable: bool = False,
transforms: list[Callable] | None = None, # type: ignore[type-arg]
allow_nan_equality: bool = False,
ignore_column_order: bool = False,
ignore_row_order: bool = False,
ignore_columns: list[str] | None = None,
ignore_metadata: bool = False,
formats: FormattingConfig | None = None,
) -> None:
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by moving ignore_metadata after the existing positional tail in both DataFrame assertion functions and the Chispa.assert_df_equality wrapper. Added regression coverage for legacy positional formats calls so the old call order stays compatible.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this - this change will make it compatible with https://github.com/MrPowers/chispa/blob/main/chispa/dataframe_comparer.py#L69 and https://github.com/MrPowers/chispa/blob/main/chispa/__init__.py#L43, although we need to unify it to put ignore_metadata before ignore_columns

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say the back-compatibility is the first priority. It may be a good idea in theory to prepare and make something like 1.0 release, but due to lack of maintenance I'm against it now.

@SemyonSinchenko
Copy link
Copy Markdown
Collaborator

@alexott sorry, I lost this one. Could you tell me please, what is the status? I'm kinda agreed with Copilot about breaking nature of this change. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants