Fix table data px dimension parsing bug #1788

Open
HrsLizo wants to merge 1 commit into py-pdf:master from HrsLizo:fix-td-px-bug

@HrsLizo HrsLizo commented Mar 19, 2026

e.g. Fixes #0

Checklist:

  • A unit test is covering the code added / modified by this PR N/A

  • In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder N/A

  • A mention of the change is present in CHANGELOG.md N/A

  • This PR is ready to be merged

Added a .replace("px", "") strip to safely handle pixel dimensions during HTML table parsing.

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

Collaborator

@andersonhc andersonhc left a comment

Changes requested:

  • Fix indentation
  • Add a test in test/html/test_html.py exercising your change
  • Add a CHANGELOG entry

If you need clarification or assistance in any of these please let us know.

Comment thread fpdf/html.py
Comment on lines +1128 to +1130
for dimension in ("width", "height"):
if dimension in self.td_th and self.td_th[dimension] is not None:
self.td_th[dimension] = self.td_th[dimension].replace("px", "").strip()
Collaborator

Suggested change
- for dimension in ("width", "height"):
- if dimension in self.td_th and self.td_th[dimension] is not None:
- self.td_th[dimension] = self.td_th[dimension].replace("px", "").strip()
+ for dimension in ("width", "height"):
+     if dimension in self.td_th and self.td_th[dimension] is not None:
+         self.td_th[dimension] = self.td_th[dimension].replace("px", "").strip()
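
For readers following the thread, here is a minimal runnable sketch of the suggested nesting. It is only an illustration: strip_px is a hypothetical standalone helper, and the plain dict td_th stands in for self.td_th, whose surrounding code in fpdf/html.py is not shown in this PR.

```python
def strip_px(td_th):
    """Remove a "px" suffix from the width/height entries of an attribute dict.

    Hypothetical helper for illustration; not part of fpdf2.
    """
    for dimension in ("width", "height"):
        # Both the check and the assignment sit inside the loop body,
        # so they run once for "width" and once for "height".
        if dimension in td_th and td_th[dimension] is not None:
            td_th[dimension] = td_th[dimension].replace("px", "").strip()
    return td_th


attrs = strip_px({"width": "120px", "height": " 40 px ", "align": "left"})
print(attrs)
```

Keys other than width and height are left untouched, and a value of None is skipped rather than raising an AttributeError.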

@Pawansingh3889 Pawansingh3889 left a comment

Good catch on stripping px from HTML table dimensions. Two observations:

  1. Indentation issue: The if and assignment lines need to be indented one level deeper to be inside the loop.

As written, only the for line runs in the loop, and the if/assignment runs once with dimension set to "height" (the last value).

  2. Other units: HTML also supports em, rem, pt, % in dimension attributes. You might want to handle those too, or at minimum strip non-numeric characters.

Otherwise the approach is solid. The px stripping is the most common case users will hit.
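
One way to read "strip non-numeric characters" is to keep only the leading number of the attribute value. The sketch below is an assumption about what that could look like, not code from this PR or from fpdf2; leading_number is a hypothetical name.

```python
import re

# Leading integer or decimal number, ignoring leading whitespace.
_LEADING_NUMBER = re.compile(r"\s*(\d+(?:\.\d+)?)")


def leading_number(value):
    """Return the numeric prefix of a dimension string, or None if there is none.

    Hypothetical helper for illustration; not part of fpdf2.
    """
    if value is None:
        return None
    match = _LEADING_NUMBER.match(value)
    return match.group(1) if match else None


for raw in ("120px", "1.5em", " 30 pt", "50%", "auto"):
    print(repr(raw), "->", repr(leading_number(raw)))
```

This drops the unit without converting it, so "50%" and "50px" both yield "50". Whether that is acceptable for percentages is a separate decision the sketch does not make.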

@Pawansingh3889 Pawansingh3889 left a comment

Thanks for tackling this — I hit similar dimension parsing issues when generating PDF audit reports from HTML tables.

I notice an indentation issue in the diff: the if dimension in self.td_th line appears to be at the same indent level as the for loop rather than inside it. This would cause a syntax error or incorrect behavior — the if block needs to be indented one level deeper under the for dimension in ... loop.

Also worth considering:

  • Should this also handle other CSS units like em, rem, pt, %? At minimum a regex like re.sub(r"[a-z%]+$", "", value) would be more robust.
  • Edge case: what happens with width="auto"? The .replace("px", "") would pass it through unchanged and it may fail downstream when parsed as a number.
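
To make the two bullets concrete, here is a small sketch that applies the regex quoted above and then checks whether a number is left. strip_unit and to_number are hypothetical names introduced for illustration; how fpdf2 consumes the value downstream is not shown in this thread.

```python
import re


def strip_unit(value):
    """Drop a trailing run of lowercase letters or "%" (the regex from the review)."""
    return re.sub(r"[a-z%]+$", "", value.strip()).strip()


def to_number(value):
    """Return the stripped value as a float, or None when no number is left."""
    try:
        return float(strip_unit(value))
    except ValueError:
        return None


for raw in ("120px", "1.5em", "50%", "auto", "120PX"):
    print(repr(raw), "->", repr(strip_unit(raw)), "->", to_number(raw))
```

With this regex, "auto" is reduced to an empty string instead of being passed through, so the failure is still there but is easier to detect. Because the character class is lowercase only, an uppercase unit such as "120PX" is left unchanged and is rejected by the float() check.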

A test case for HTML like <td width="120px"> would help confirm the fix works end-to-end.
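
As a rough idea of what such a test could check, the sketch below feeds that HTML through a stand-in parser built on the standard library's html.parser.HTMLParser. CellAttrParser is hypothetical and only mimics the loop from this PR; a real test in test/html/test_html.py would exercise fpdf2's own HTML handling instead.

```python
from html.parser import HTMLParser


class CellAttrParser(HTMLParser):
    """Stand-in parser that records normalized <td>/<th> attributes (not fpdf2 code)."""

    def __init__(self):
        super().__init__()
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("td", "th"):
            return
        td_th = dict(attrs)
        # Same normalization as the loop under review.
        for dimension in ("width", "height"):
            if dimension in td_th and td_th[dimension] is not None:
                td_th[dimension] = td_th[dimension].replace("px", "").strip()
        self.cells.append(td_th)


def test_td_px_width_is_stripped():
    parser = CellAttrParser()
    parser.feed('<table><tr><td width="120px">A</td><td width="80">B</td></tr></table>')
    assert parser.cells == [{"width": "120"}, {"width": "80"}]


test_td_px_width_is_stripped()
```

The second cell has no unit, which confirms that plain numeric values pass through unchanged.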

@andersonhc
Collaborator

Hi @HrsLizo

Just checking in to see if you saw the review and if you're still interested in completing this PR?
