src: skip duplicate UTF-8 validation in TextDecoder fatal path #63231

Open
mertcanaltin wants to merge 2 commits into nodejs:main from mertcanaltin:mert/textdecoder-skip-utf8-revalidation

@mertcanaltin (Member) commented May 10, 2026

In TextDecoder's strict (fatal) mode, the UTF-8 buffer was being validated twice. I removed the second validation; non-ASCII decoding in strict mode is now 27–34% faster.
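For context, "strict mode" here corresponds to the `fatal: true` option of `TextDecoder`. The following is a minimal sketch of the observable behavior only; it does not show the C++ change in this PR, and the sample byte sequences are chosen purely for illustration.

```javascript
// Illustrative only: the user-visible contract of fatal mode,
// not the internal validation path touched by this PR.
const strict = new TextDecoder('utf-8', { fatal: true });
const lenient = new TextDecoder('utf-8');

// "café": 'é' is encoded as the two bytes 0xC3 0xA9.
const valid = new Uint8Array([0x63, 0x61, 0x66, 0xc3, 0xa9]);
// 0xFF can never appear in well-formed UTF-8.
const invalid = new Uint8Array([0x61, 0xff, 0x62]);

const decodedValid = strict.decode(valid);

let strictError = null;
try {
  strict.decode(invalid);
} catch (err) {
  // Fatal mode rejects malformed input instead of substituting U+FFFD.
  strictError = err;
}

// Non-fatal mode replaces the bad byte with U+FFFD.
const decodedLenient = lenient.decode(invalid);

console.log(decodedValid, strictError !== null, decodedLenient);
```

Because fatal mode must reject malformed input, the bytes have to be validated at least once before a string is produced; the PR description is about not doing that work a second time.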

Benchmark results (very long, so I put them in a gist):

https://gist.github.com/mertcanaltin/42f2fe0808e85741bcc76e7f59a4229e

Short results:

util/text-decoder.js fatal=1 encoding='utf-8' content='one-byte-string'
  len=131072 ArrayBuffer       +31.14% ±1.90%
  len=131072 Buffer            +32.05% ±1.31%
  len=131072 SharedArrayBuffer +31.54% ±0.75%
  len=16384  ArrayBuffer       +28.06% ±1.73%
  len=16384  Buffer            +27.49% ±2.92%
  len=16384  SharedArrayBuffer +28.03% ±1.77%

util/text-decoder.js fatal=1 encoding='utf-8' content='two-byte-string'
  len=131072 ArrayBuffer       +33.67% ±1.17%
  len=131072 Buffer            +34.15% ±1.73%
  len=131072 SharedArrayBuffer +34.03% ±1.12%
  len=16384  ArrayBuffer       +31.13% ±1.71%
  len=16384  Buffer            +29.02% ±1.73%
  len=16384  SharedArrayBuffer +28.34% ±2.12%

util/text-decoder-stream.js unicode=1 fatal=1 encoding='utf-8'
  len=131072 Buffer            +11.66%
  len=131072 SharedArrayBuffer  +9.36%
  len=16384  Buffer             +8.08%
  len=16384  SharedArrayBuffer  +6.65%
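To get a rough feel for the kind of workload measured above, something like the following sketch can be run against two Node.js builds. It is not the `benchmark/util/text-decoder.js` file: the input string (repeated `ğ`), the iteration count, and the timing method are all assumptions made for illustration, and the numbers it prints are not comparable to the statistically tested results listed here.

```javascript
// Rough micro-benchmark sketch (hypothetical, not the Node.js benchmark suite).
const decoder = new TextDecoder('utf-8', { fatal: true });

// Assumed input: 'ğ' (U+011F) takes two UTF-8 bytes, so 8192 copies
// give a 16384-byte buffer, matching the len=16384 size used above.
const input = Buffer.from('ğ'.repeat(8192));

const iterations = 1000; // arbitrary choice for illustration
let totalChars = 0;

const start = performance.now();
for (let i = 0; i < iterations; i++) {
  // Accumulate the length so the decode result is actually used.
  totalChars += decoder.decode(input).length;
}
const elapsedMs = performance.now() - start;

console.log(
  `${iterations} fatal decodes of ${input.length} bytes: ${elapsedMs.toFixed(1)} ms`
);
```

A single timed loop like this is noisy; the confidence intervals reported above come from repeated runs, which this sketch does not attempt to reproduce.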

@nodejs-github-bot (Collaborator)
Review requested:

  • @nodejs/performance

@nodejs-github-bot added the c++, lib / src, and needs-ci labels May 10, 2026
Signed-off-by: Mert Can Altin <mertgold60@gmail.com>
@mertcanaltin force-pushed the mert/textdecoder-skip-utf8-revalidation branch from 1ea0c09 to 25fe9af May 10, 2026 20:55
@codecov (bot) commented May 10, 2026

Codecov Report

❌ Patch coverage is 29.62963% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.03%. Comparing base (facd71e) to head (0df76ab).
⚠️ Report is 25 commits behind head on main.

Files with missing lines   Patch %   Lines
src/string_bytes.cc        25.00%    13 Missing and 5 partials ⚠️
src/encoding_binding.cc    66.66%    0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #63231      +/-   ##
==========================================
- Coverage   90.04%   90.03%   -0.01%     
==========================================
  Files         713      713              
  Lines      224950   224976      +26     
  Branches    42530    42542      +12     
==========================================
+ Hits       202548   202554       +6     
- Misses      14188    14191       +3     
- Partials     8214     8231      +17     
Files with missing lines   Coverage Δ
src/string_bytes.h         80.00% <ø> (ø)
src/encoding_binding.cc    53.22% <66.66%> (+0.30%) ⬆️
src/string_bytes.cc        71.34% <25.00%> (-3.29%) ⬇️

... and 34 files with indirect coverage changes


@gurgunday (Member)

Cc @ChALkeR if you want to take a look
