Sub-quadratic bignum integer-to-string by ericmj · Pull Request #11074 · erlang/otp

ericmj · 2026-04-29T12:45:28Z

Render-side integer_to_list/1 and integer_to_binary/1 of large bignums were O(N²) in the number of decimal digits, exposing a DoS surface for any library that converts attacker-controlled bignums to strings. This rewrites the bignum render path with three layered optimizations.

Algorithms

1. Divide-and-conquer wrapper around write_big. Splits an N-digit bignum at base^(N/2) via a single bignum divmod, recurses on each half, and writes the halves into adjacent positions in the output buffer (low half zero-padded). Below WRITE_BIG_DC_THRESHOLD (250 decimal digits, picked from a sweep) the original schoolbook single-digit-extraction loop is used. A power-of-base cache is built once per top-level call and shared across the recursion.
2. Burnikel-Ziegler recursive division. The D&C wrapper above is bounded by the cost of bignum divmod. With OTP's existing Knuth-D I_div (O(xl·yl)), the wrapper stays O(N²), which is only a constant-factor improvement. BZ replaces I_div for divisors above BZ_DIV_THRESHOLD=8 ErtsDigits, giving O(M(n)·log n) per divmod (sub-quadratic with the existing Karatsuba multiplication). It is wired through I_div_dispatch so big_div_rem, big_div, and the render D&C all benefit. One subtle dependency: I_mul_karatsuba assumes normalized inputs (a single zero digit, or a non-zero top digit), which BZ's recursive Q sometimes violates, so bz_div_3n_2n trims Q and B2 before each multiplication.
3. Barrett reciprocals for the power cache. The render D&C divides by the same cached base^k divisors many times across the recursion tree. Pre-computing mu_i = floor(beta^(2*sizes[i] + 1) / vals[i]) once per cache level (skipped for levels under BARRETT_LEVEL_THRESHOLD=100 ErtsDigits, where the mu-build cost wouldn't pay back) lets each divmod become a multiplication plus a small correction loop, saving a log-N factor per call vs BZ. The +1 ErtsDigit of precision in mu keeps the correction within 2 iterations even though base^k isn't normalized (that is, its top bit need not be set).
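The divide-and-conquer wrapper described above can be sketched in a few lines of Python. This is only an illustration of the splitting scheme, not the C code in big.c: the names `int_to_decimal`, `schoolbook` and `DC_THRESHOLD` are hypothetical, Python's built-in `divmod` stands in for the bignum division, and the power cache is filled lazily rather than built up front.

```python
DC_THRESHOLD = 250  # decimal digits; mirrors WRITE_BIG_DC_THRESHOLD from the text


def schoolbook(x, width):
    """Base case: extract one decimal digit per step, then zero-pad to width."""
    digits = []
    while x:
        x, d = divmod(x, 10)
        digits.append("0123456789"[d])
    return "".join(reversed(digits)).rjust(width, "0")


def int_to_decimal(n):
    """Render n in base 10 by recursive splitting at 10**(width // 2)."""
    if n < 0:
        return "-" + int_to_decimal(-n)
    if n == 0:
        return "0"
    # Upper bound on the digit count: 0.30103 is slightly above log10(2).
    width = n.bit_length() * 30103 // 100000 + 1
    cache = {}  # power-of-base cache shared by the whole recursion

    def rec(x, w):
        # Invariant: 0 <= x < 10**w; the result has exactly w characters.
        if w <= DC_THRESHOLD:
            return schoolbook(x, w)
        k = w // 2
        p = cache.get(k)
        if p is None:
            p = cache[k] = 10 ** k
        hi, lo = divmod(x, p)  # the single big division per level
        return rec(hi, w - k) + rec(lo, k)  # low half comes back zero-padded

    return rec(n, width).lstrip("0")
```

The total cost is dominated by the `divmod` calls, which is why the division algorithm underneath decides whether the wrapper is actually sub-quadratic.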

References: Brent & Zimmermann Modern Computer Arithmetic; Burnikel & Ziegler MPI-I-98-1-022 (1998); CPython _pylong; V8 src/bigint.

Benchmarks (single-process min µs, macOS aarch64, 64-bit ErtsDigit)

integer_to_list/1 and integer_to_binary/1:

Decimal digits	Baseline (µs)	Final (µs)	Speedup
100	0	0	—
1 000	12	11	1.1×
10 000	1 923	512	3.8×
100 000	204 451	17 742	11.5×
300 000	1 934 852	131 916	14.7×
1 000 000	21 989 158	571 683	38.5×

Asymptotic class fit (100k → 1M, 10× input):
Baseline: 107.5× → slope 2.03 → O(N²)
Final: 32.2× → slope 1.51 → sub-quadratic, near M(N) bound
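The slope in this fit is the exponent p in T(N) ∝ N^p, estimated from two measurements. A minimal sketch of that calculation, using the 100 000-digit and 1 000 000-digit rows of the table (the helper name `slope` is an assumption for illustration):

```python
import math


def slope(n1, t1, n2, t2):
    """Estimate p in T(N) ~ N**p from two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Timings in microseconds, copied from the benchmark table.
baseline_slope = slope(100_000, 204_451, 1_000_000, 21_989_158)
final_slope = slope(100_000, 17_742, 1_000_000, 571_683)

print(f"baseline slope: {baseline_slope:.2f}")
print(f"final slope:    {final_slope:.2f}")
```

With a 10× step in input size the slope is simply log10 of the timing ratio, so a two-point fit like this is sensitive to noise in either measurement.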

integer_to_list of a 1M-digit bignum: 22.0 s → 0.57 s.

Parse-side binary_to_integer/1 and list_to_integer/1 are unchanged; the existing Erlang-level segmentize+pairwise-combine in big_binary_to_int was already Karatsuba-bound (slope 1.55 across the full curve) and a C port did not move the numbers.
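For readers unfamiliar with the segmentize plus pairwise-combine scheme, the idea can be sketched as follows. This is a hedged Python illustration of the general technique, not a transcription of big_binary_to_int; the segment width of 18 digits and the name `decimal_to_int` are assumptions, and the input is taken to be a non-empty string of decimal digits without a sign.

```python
SEGMENT = 18  # hypothetical segment width in decimal digits


def decimal_to_int(s):
    """Parse a non-empty decimal digit string by pairwise combination."""
    # Segmentize from the right so every segment except possibly the most
    # significant one covers exactly SEGMENT digits.
    segs = []
    end = len(s)
    while end > 0:
        start = max(0, end - SEGMENT)
        segs.append(int(s[start:end]))
        end = start
    segs.reverse()  # most significant segment first

    width = SEGMENT
    while len(segs) > 1:
        scale = 10 ** width
        merged = []
        first = len(segs) % 2
        if first:
            merged.append(segs[0])  # odd count: leading segment waits a round
        for j in range(first, len(segs), 2):
            merged.append(segs[j] * scale + segs[j + 1])
        segs = merged
        width *= 2  # every non-leading segment now covers twice the digits
    return segs[0]
```

Each round halves the number of segments and performs multiplications of roughly equal-sized operands, so the running time follows the multiplication cost, which matches the Karatsuba-bound slope reported above.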

For the DoS surface described at the top, cf. the Elixir `decimal` advisory GHSA-rhv4-8758-jx7v.

Tests

- All existing OTP `num_bif_SUITE` and `big_SUITE` cases pass on the opt and asan builds.
- The patch was developed iteratively under AddressSanitizer; the BZ recursion's I_mul_karatsuba normalization assumption was found by ASan via a heap-buffer-overflow read in I_mul_karatsuba's I_sub (big.c:991) caused by Q with leading-zero cells.

Review follow-ups

Follow-ups from review of the sub-quadratic integer-to-string rewrite:

* Document I_mul_karatsuba's normalized-input precondition and assert it at entry, so future callers that pass zero-padded digits fail loudly rather than reading one cell past the internal scratch.
* Size barrett_divmod's prod buffer with BARRETT_MAX_CORRECTIONS slack to eliminate a latent off-by-one if the correction loop's D_add ever carries into a new cell, and assert the bound before each carry.
* Lift BARRETT_LEVEL_THRESHOLD next to the other tuning knobs, and drop the unused BZ_DEBUG/BZ_TRACE macros and orig_x_padded variable.
* Add t_integer_to_string_large, round-tripping integers across each threshold boundary (WRITE_BIG_DC_THRESHOLD, BARRETT_LEVEL_THRESHOLD, deep cache levels) in bases 2/8/10/16/36, including negatives and a power-of-10 to exercise the high-half-zero split branch.
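The Barrett step and its bounded correction loop can be illustrated with a short Python sketch. It follows the formula mu = floor(beta^(2*s + 1) / d) given in the description, with beta = 2^64 to match the 64-bit ErtsDigit build; the function names and the exact-arithmetic product are assumptions for illustration and do not reflect how big.c sizes or truncates its buffers.

```python
BETA_BITS = 64  # one ErtsDigit on the 64-bit build mentioned in the benchmarks


def digit_len(v):
    """Number of BETA_BITS-wide digits needed to hold v (at least 1)."""
    return max(1, -(-v.bit_length() // BETA_BITS))


def barrett_precompute(d):
    """Return (mu, shift) with mu = floor(beta**(2*s + 1) / d), s = digit_len(d)."""
    shift = BETA_BITS * (2 * digit_len(d) + 1)
    return (1 << shift) // d, shift


def barrett_divmod(x, d, mu, shift):
    """Divide x by d using the precomputed reciprocal.

    Assumes 0 <= x < beta**(2*s). The estimate never exceeds the true
    quotient, so the loop only ever adds to q.
    """
    q = (x * mu) >> shift  # quotient estimate from one multiplication
    r = x - q * d
    corrections = 0
    while r >= d:  # small correction loop
        r -= d
        q += 1
        corrections += 1
    return q, r, corrections
```

Because mu is computed once per cached divisor, every later division by that divisor costs only multiplications plus the correction loop, and counting the corrections makes it easy to assert a bound such as BARRETT_MAX_CORRECTIONS in tests.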

github-actions · 2026-04-29T12:46:30Z

CT Test Results

3 files 136 suites 50m 49s ⏱️
1 677 tests 1 620 ✅ 57 💤 0 ❌
2 319 runs 2 244 ✅ 75 💤 0 ❌

Results for commit c74f81f.

♻️ This comment has been updated with latest results.


- big.c: move declarations above early return in barrett_divmod to satisfy -Werror=declaration-after-statement; drop unused 'p' in write_big_dc_top
- big.c, num_bif_SUITE.erl: convert tabs to spaces on new lines
- license-header.es: allowlist big_SUITE_data/karatsuba.dat

bjorng · 2026-04-29T14:14:03Z

Thanks! Looks good to me after a very quick glance. It is too late to include in Erlang/OTP 29.0, so we will aim to include it in a patch release for OTP 29.

bjorng self-assigned this Apr 29, 2026

bjorng added the team:VM Assigned to OTP team VM label Apr 29, 2026

ericmj marked this pull request as ready for review April 29, 2026 15:01

ericmj wants to merge 2 commits into erlang:master from ericmj:ericmj/bignum-render-dc-bz-barrett
