Fix table data px dimension parsing bug by HrsLizo · Pull Request #1788 · py-pdf/fpdf2

HrsLizo · 2026-03-19T13:44:38Z

e.g. Fixes #0

Checklist:

A unit test is covering the code added / modified by this PR N/A
In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder N/A
A mention of the change is present in CHANGELOG.md N/A
This PR is ready to be merged

Added a .replace("px", "") strip to safely handle pixel dimensions during HTML table parsing.

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

andersonhc

Changes requested:

Fix indentation
Add a test in test/html/test_html.py exercising your change
Add a CHANGELOG entry

If you need clarification or assistance with any of these, please let us know.

andersonhc · 2026-03-26T02:43:19Z

+            for dimension in ("width", "height"):
+            if dimension in self.td_th and self.td_th[dimension] is not None:
+                self.td_th[dimension] = self.td_th[dimension].replace("px", "").strip()


Suggested change

-            for dimension in ("width", "height"):
-            if dimension in self.td_th and self.td_th[dimension] is not None:
-                self.td_th[dimension] = self.td_th[dimension].replace("px", "").strip()
+            for dimension in ("width", "height"):
+                if dimension in self.td_th and self.td_th[dimension] is not None:
+                    self.td_th[dimension] = self.td_th[dimension].replace("px", "").strip()

Pawansingh3889

Good catch on stripping px from HTML table dimensions. Two observations:

Indentation issue: The if and assignment lines need to be indented one level deeper to be inside the loop.

As written, only the for line runs in the loop, and the if/assignment runs once with dimension set to "height" (the last value).

Other units: HTML also supports em, rem, pt, % in dimension attributes. You might want to handle those too, or at minimum strip non-numeric characters.

Otherwise the approach is solid. The px stripping is the most common case users will hit.
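
A minimal sketch of the correctly indented loop, using a plain dict as a stand-in for self.td_th. The function name strip_px and the sample attributes are illustrative and not part of the PR. Note that Python itself rejects a for statement with no indented body (IndentationError), so the loop as shown in the diff would not get as far as running; with the if block nested, both width and height are processed:

```python
def strip_px(td_th):
    """Strip "px" from width/height in place.

    td_th is a plain dict standing in for self.td_th (an assumption
    made for illustration; the real attribute lives on the HTML parser).
    """
    for dimension in ("width", "height"):
        # Both lines below sit inside the loop, so each key is handled.
        if dimension in td_th and td_th[dimension] is not None:
            td_th[dimension] = td_th[dimension].replace("px", "").strip()
    return td_th


attrs = {"width": "120px", "height": " 40px ", "align": "center"}
print(strip_px(attrs))
# {'width': '120', 'height': '40', 'align': 'center'}
```

Keys other than width and height are left untouched, and a missing or None value is skipped rather than raising.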

Pawansingh3889

Thanks for tackling this — I hit similar dimension parsing issues when generating PDF audit reports from HTML tables.

I notice an indentation issue in the diff: the if dimension in self.td_th line appears to be at the same indent level as the for loop rather than inside it. This would cause a syntax error or incorrect behavior — the if block needs to be indented one level deeper under the for dimension in ... loop.

Also worth considering:

Should this also handle other CSS units like em, rem, pt, %? At minimum a regex like re.sub(r"[a-z%]+$", "", value) would be more robust.
Edge case: what happens with width="auto"? The .replace("px", "") would pass it through unchanged and it may fail downstream when parsed as a number.
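
The two points above can be sketched together. This is a rough illustration only: parse_dimension is a hypothetical helper, not an fpdf2 function, and returning None for unparseable values is an assumption about what a caller would want. It uses the regex suggested above and shows that "auto" needs explicit handling either way: .replace("px", "") leaves it as "auto", while the regex reduces it to an empty string, and float() rejects both.

```python
import re

# Trailing run of lowercase letters or "%", as in the regex suggested above.
_UNIT_SUFFIX = re.compile(r"[a-z%]+$")


def parse_dimension(value):
    """Return the numeric part of an HTML dimension as a float, or None.

    Hypothetical helper for illustration; not part of fpdf2.
    """
    if value is None:
        return None
    cleaned = _UNIT_SUFFIX.sub("", value.strip().lower())
    try:
        return float(cleaned)
    except ValueError:
        # Covers "auto" (which the regex reduces to "") and other non-numbers.
        return None


print(parse_dimension("120px"))  # 120.0
print(parse_dimension("1.5em"))  # 1.5
print(parse_dimension("50%"))    # 50.0
print(parse_dimension("auto"))   # None
```

One caveat with this approach: stripping the suffix discards the unit, so "50%" and "50px" become indistinguishable. A caller that cares about percentages would likely need to inspect the suffix before removing it.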

A test case for HTML like <td width="120px"> would help confirm the fix works end-to-end.

andersonhc · 2026-04-07T12:06:01Z

Hi @HrsLizo

Just checking in to see if you saw the review and if you're still interested in completing this PR?

Fix table data px dimension parsing bug

0445e17

andersonhc requested changes Mar 26, 2026

Pawansingh3889 reviewed Apr 4, 2026

Pawansingh3889 reviewed Apr 5, 2026

HrsLizo wants to merge 1 commit into py-pdf:master from HrsLizo:fix-td-px-bug

3 participants
