
decompiler: add missing ccall condition rewrites + simplifier passes + 8 codegen bug fixes#6182

Closed
zardus wants to merge 14 commits into master from decompiler-ccop-handlers

Conversation

@zardus
Member

@zardus zardus commented Feb 25, 2026

(AI COMMENT)

Summary

  • Add ~60 new ccall condition code rewriters for AMD64, x86, ARM, and ARM64
  • Add 3 new optimization passes (OverflowBuiltinSimplifier, OverflowBuiltinPredicateSimplifier, CarryFlagSimplifier)
  • Fix 9 bugs in the C code generator and ccall rewriters found by recompilability testing and corpus analysis
  • Add comprehensive test suites: unit tests with Z3 semantic checking, ccop_triggers decompilation tests, and a recompilability round-trip test

Companion PR

Requires angr/binaries#163 (ccop_triggers test binaries).

What changed

New ccall condition rewrites

The ccall rewriter translates VEX helper calls (amd64g_calculate_condition, etc.) into C-level comparisons. Many condition/operation combinations were unhandled and fell through to raw _ccall(calculate_condition, ...) in the decompiled output.

New coverage includes:

  • AMD64: CondB/CondNB for ADD/ADC, CondBE/CondNBE for SUB/ADD, CondL/CondNL/CondLE/CondNLE for SUB/ADD, CondS/CondNS for logic/inc/dec, CondO/CondNO for ADD/SMUL, rflags_c for SUB/DEC, all SBB condition codes
  • x86: Mirror of AMD64 additions adapted for 32-bit, plus _fix_size for sub-width operations
  • ARM: CondMI/CondPL, CondVS/CondVC, CondHI/CondLS
  • ARM64: Complete new rewriter covering all standard AArch64 condition codes
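To illustrate what such a rewrite claims, here is a minimal sketch in plain Python (not angr code; every helper name is hypothetical). It models the flags of an 8-bit SUB and checks by enumeration that CondL (SF != OF) and CondBE (CF or ZF) agree with the C-level comparisons a rewriter would emit. The PR's own tests use Z3 for this; enumeration is only practical at small widths.

```python
BITS = 8
MASK = (1 << BITS) - 1

def to_signed(x):
    """Interpret an 8-bit pattern as a two's-complement value."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x

def sub_flags(a, b):
    """Model ZF, SF, CF, OF after an 8-bit `a - b` (x86 semantics)."""
    res = (a - b) & MASK
    zf = res == 0
    sf = bool(res >> (BITS - 1))
    cf = a < b  # borrow out of the unsigned subtraction
    # Signed overflow: operand signs differ and the result sign differs from a's.
    of = bool(((a ^ b) & (a ^ res)) >> (BITS - 1))
    return zf, sf, cf, of

def cond_l(a, b):
    """CondL as the flags define it: SF != OF."""
    _, sf, _, of = sub_flags(a, b)
    return sf != of

def cond_be(a, b):
    """CondBE as the flags define it: CF or ZF."""
    zf, _, cf, _ = sub_flags(a, b)
    return cf or zf

# Collect every input pair where the flag formula and the comparison disagree.
mismatches = [
    (a, b)
    for a in range(1 << BITS)
    for b in range(1 << BITS)
    if cond_l(a, b) != (to_signed(a) < to_signed(b))
    or cond_be(a, b) != (a <= b)
]
print(len(mismatches))
```

An empty `mismatches` list means the signed `<` and unsigned `<=` rewrites are exact for every 8-bit input pair.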

New optimization passes

  1. OverflowBuiltinSimplifier: Rewrites paired overflow-check + conditional patterns (__OFADD__ followed by if-then) into __builtin_add_overflow / __builtin_sub_overflow
  2. OverflowBuiltinPredicateSimplifier: Rewrites standalone overflow predicates in conditions into __builtin_add_overflow_p / __builtin_mul_overflow_p
  3. CarryFlagSimplifier: Rewrites __CFADD__(a, b) != 0 into __builtin_add_overflow_p(a, b, (type)0)

All registered in fast and full presets. Disabling via simplifier blacklist restores IDA-style macro output.
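The equivalence behind the CarryFlagSimplifier rewrite can be sketched in plain Python (hypothetical helper names, not the pass itself): the carry out of a fixed-width unsigned add is set exactly when the exact sum does not fit in that unsigned type, which is what an unsigned `__builtin_add_overflow_p` reports.

```python
def cfadd(a, b, bits):
    """Carry out of a `bits`-wide unsigned add: the wrapped sum is below an operand."""
    mask = (1 << bits) - 1
    return int(((a + b) & mask) < a)

def unsigned_add_overflows(a, b, bits):
    """True when the exact sum does not fit in an unsigned `bits`-wide type."""
    return a + b > (1 << bits) - 1

# Exhaustive check at 8 bits: `__CFADD__(a, b) != 0` matches the overflow predicate.
carry_matches_overflow = all(
    (cfadd(a, b, 8) != 0) == unsigned_add_overflows(a, b, 8)
    for a in range(256)
    for b in range(256)
)
print(carry_matches_overflow)
```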

Bugs fixed in existing code

These were found by the recompilability round-trip test, which decompiles ccop_trigger functions, recompiles with GCC, and compares outputs against the originals, plus corpus decompilation analysis.

| # | Bug | Functions affected | Pre-existing? |
|---|-----|--------------------|---------------|
| 1 | SBB condition codes completely unhandled | 29 | Yes |
| 2 | Logic ops lose 2nd arg (CC recovery) | 27 | Yes (xfailed) |
| 3 | _fix_size() always emits unsigned Convert nodes, breaking signed conditions | 57 | Yes |
| 4 | Missing narrow-type truncation cast for 8/16-bit null comparisons (C integer promotion) | 25 | Yes |
| 5 | __builtin_*_overflow_p third arg type mismatch for non-32-bit widths | 20 | No (introduced by new simplifier) |
| 6 | Operator precedence: !x + y instead of !(x + y) | 13 | Yes |
| 7 | rflagsc_sub inverted borrow, rflagsc_dec un-simplifiable, condition_processor __neg__ mapped to Not instead of Neg | 4 | Yes |
| 8 | Narrowing casts not absorbed into variable declarations in comparisons | corpus-wide | Yes |
| 9 | No-op signedness casts on char/short leaf operands in comparisons | corpus-wide | Yes |

Bug fix details

Bug 6 — operator precedence: When rendering CmpEQ(Add(a,b), 0) as !a + b, the code used binary CmpEQ precedence to decide about parens, but the effective operator is unary ! (higher precedence). Fix: always parenthesize compound LHS in !expr shorthand.
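A toy renderer shows the shape of this fix. This is a sketch with hypothetical `Var`/`BinOp` classes, not angr's `CBinaryOp` code: the buggy variant prefixes `!` to the rendered operand unchanged, while the fixed variant wraps any compound operand in parentheses first.

```python
from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    lhs: object
    rhs: object

def render(expr):
    """Render an expression, parenthesizing nested binary operands conservatively."""
    if isinstance(expr, Var):
        return expr.name
    lhs, rhs = render(expr.lhs), render(expr.rhs)
    if isinstance(expr.lhs, BinOp):
        lhs = f"({lhs})"
    if isinstance(expr.rhs, BinOp):
        rhs = f"({rhs})"
    return f"{lhs} {expr.op} {rhs}"

def render_not_buggy(expr):
    """`expr == 0` shorthand without parentheses: wrong for compound operands."""
    return "!" + render(expr)

def render_not_fixed(expr):
    """`expr == 0` shorthand that always parenthesizes a compound operand."""
    inner = render(expr)
    return f"!({inner})" if isinstance(expr, BinOp) else "!" + inner

add = BinOp("+", Var("a"), Var("b"))
print(render_not_buggy(add))   # !a + b
print(render_not_fixed(add))   # !(a + b)
```

The two strings are different C expressions: with a = 1 and b = 2, `!a + b` evaluates to 2 (true) while `!(a + b)` evaluates to 0 (false).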

Bug 3 — signedness: _fix_size() unconditionally set is_signed=False on narrowing Convert nodes. Fix: _fix_size() gains a signed= parameter threaded from condition semantics; CBinaryOp gains _cmp_signedness_cast() for function parameter casts; _propagate_cmp_signedness() updates local variable types directly.
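Why the signedness of a narrowed operand matters can be seen in a small plain-Python model (hypothetical helpers, not the `_fix_size()` code): the same bit pattern orders differently depending on whether the comparison is signed or unsigned.

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def less_than(a, b, bits, signed):
    """Model `a < b` at the given width and signedness."""
    mask = (1 << bits) - 1
    if signed:
        return to_signed(a, bits) < to_signed(b, bits)
    return (a & mask) < (b & mask)

# 0xFF is -1 as signed char but 255 as unsigned char.
print(less_than(0xFF, 0x01, 8, signed=True))
print(less_than(0xFF, 0x01, 8, signed=False))
```

A CondL rewrite rendered with unsigned operands therefore flips the result whenever exactly one operand has its top bit set.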

Bug 4 — narrow truncation: C promotes char/short to int before arithmetic, so (unsigned char)0x80 + (unsigned char)0x80 = 0x100 (non-zero) instead of wrapping to 0. Fix: emit explicit truncation cast for sub-32-bit null comparisons.
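The promotion effect can be modelled in a few lines of Python (a sketch; the function names are hypothetical). Python integers do not wrap, which matches what C does after promoting `unsigned char` operands to `int`; the explicit mask plays the role of the `(unsigned char)` cast.

```python
def is_zero_without_cast(a, b):
    """Models `!(a + b)` for unsigned char a, b: the sum is computed at int width."""
    return (a + b) == 0

def is_zero_with_cast(a, b):
    """Models `!(unsigned char)(a + b)`: the sum is truncated to 8 bits first."""
    return ((a + b) & 0xFF) == 0

print(is_zero_without_cast(0x80, 0x80))
print(is_zero_with_cast(0x80, 0x80))
```

Only the second form reproduces the 8-bit zero flag the original instruction computed.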

Bug 5 — overflow_p types: __builtin_add_overflow_p(a, b, 0) uses the third arg's type for width. Bare 0 is int, wrong for 8/64-bit. Fix: thread overflow_signed tags from ccall rewriter through simplifier to codegen, inject explicit type casts.
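The role of the third argument can be sketched with a small Python model of the predicate (an illustration of the GCC builtin's documented behaviour, with a hypothetical function name; `bits` and `signed` stand in for the type of the third argument).

```python
def add_overflow_p(a, b, bits, signed):
    """Model of __builtin_add_overflow_p(a, b, (T)0) for a T of the given shape.

    Returns True when the exact sum a + b is not representable in T.
    """
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return not (lo <= a + b <= hi)

# Same operands, different third-argument types:
print(add_overflow_p(100, 100, 32, signed=True))   # bare 0 is int: no overflow
print(add_overflow_p(100, 100, 8, signed=True))    # (signed char)0: overflow
print(add_overflow_p(2**31 - 1, 1, 32, signed=True))
print(add_overflow_p(2**31 - 1, 1, 64, signed=True))
```

With a bare `0`, every check is performed as if the operation were a signed 32-bit add, so 8-bit overflows are missed and 64-bit adds are flagged spuriously.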

Bug 7 — rflagsc: SUB handler created CmpLT without bits=1, breaking branch polarity. DEC handler used opaque Shr(And(...)) instead of ndep & 1. condition_processor.__neg__ was mapped to "Not" instead of "Neg".
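Two of these points are easy to check in isolation. The sketch below uses local constants that assume the carry flag sits at bit 0 (as the DEC fix relies on) and hypothetical helper names. It shows that the shift-and-mask extraction equals `ndep & 1`, and that arithmetic negation and bitwise NOT are different operations.

```python
# Assumed layout: carry flag at bit 0 of the flags word.
CC_SHIFT_C = 0
CC_MASK_C = 1 << CC_SHIFT_C

def cf_shift_and_mask(ndep):
    """The opaque form: (ndep & MASK_C) >> SHIFT_C."""
    return (ndep & CC_MASK_C) >> CC_SHIFT_C

def cf_simplified(ndep):
    """The simplified form emitted after the fix."""
    return ndep & 1

def neg8(x):
    """Arithmetic negation at 8 bits (what __neg__ means)."""
    return (-x) & 0xFF

def not8(x):
    """Bitwise NOT at 8 bits (what the old mapping produced)."""
    return (~x) & 0xFF

same_extraction = all(cf_shift_and_mask(n) == cf_simplified(n) for n in range(1 << 12))
print(same_extraction)
print(hex(neg8(1)), hex(not8(1)))
```

In two's complement `-x` equals `~x + 1`, so the two mappings never agree on any input.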

Bug 1 — SBB: No handlers existed. New _sbb_prep() recovers operands from VEX encoding (dep_1=argL, dep_2=argR^oldCF, ndep=old_flags), plus handlers for CondZ/NZ, CondL, CondS/NS, CondNBE, CondBE, CondNB.
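The operand recovery can be sketched in plain Python under the encoding stated above. This is a model with hypothetical names, not the PR's `_sbb_prep()` implementation. It also shows why the borrow check needs more than the operand width: `arg_r + carry` can wrap.

```python
def sbb_prep_model(dep_1, dep_2, ndep, bits):
    """Recover (arg_l, arg_r, carry, result) from the SBB thunk values."""
    mask = (1 << bits) - 1
    carry = ndep & 1                      # old CF lives in bit 0 of the old flags
    arg_l = dep_1 & mask
    arg_r = (dep_2 ^ carry) & mask        # undo the XOR with the old carry
    result = (arg_l - arg_r - carry) & mask
    return arg_l, arg_r, carry, result

def borrow_wide(arg_l, arg_r, carry):
    """CF after SBB, compared at unbounded width so arg_r + carry cannot wrap."""
    return arg_l < arg_r + carry

def borrow_narrow(arg_l, arg_r, carry, bits):
    """Incorrect variant: the right-hand side wraps at the operand width."""
    return arg_l < ((arg_r + carry) & ((1 << bits) - 1))

# Round trip: encode (a, b, c) as the thunk would, then recover it.
round_trip_ok = all(
    sbb_prep_model(a, b ^ c, c, 8) == (a, b, c, (a - b - c) & 0xFF)
    for a in range(256)
    for b in range(256)
    for c in (0, 1)
)
print(round_trip_ok)
print(borrow_wide(0, 0xFF, 1), borrow_narrow(0, 0xFF, 1, 8))
```

For 0 - 0xFF - 1 the true borrow is set, but the narrow comparison sees `0 < 0` and reports no borrow.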

Bug 8 — narrowing cast absorption: When the ccall rewriter produces narrow-width comparisons like CmpLTs(Convert(32→8, dep1), Convert(32→8, dep2)), the C codegen rendered them as (char)v3 < (char)v4. New _try_narrow_cmp_operand() method detects narrowing CTypeCast nodes wrapping local CVariables, propagates the narrow type to the variable declaration (so unsigned int v3 becomes char v3), and strips the cast. Function parameters and struct field accesses are left alone.

Bug 9 — no-op signedness casts: _cmp_signedness_cast() emits explicit casts when operand signedness disagrees with the comparison (e.g. unsigned char in a signed comparison). But SimTypeChar(signed=True) and SimTypeChar(signed=False) both render as "char" in C, so (char)v3 is a textual no-op when v3 is already char. Fix: suppress the cast when both type renderings are identical AND the operand is a simple leaf (CVariable, CVariableField, CConstant) not subject to C integer promotion. Compound expressions like (char)(a0 + a1) keep the cast since it's a real truncation from int-promoted width.

Also fixes (in existing ccall rewriter code, found by semantic tests)

  • AMD64 CondLE used wrong operand order for SUB
  • AMD64 rflags_c for ADC had inverted carry computation
  • ARM CondGE/CondLT had swapped signed comparison operators
  • VEX pc_actions_UMUL/pc_actions_SMUL extracted hi bits from already-truncated product (always zero)
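The multiply bug in the last bullet can be reproduced with a short Python model (hypothetical function names, not the VEX helper code): once the product has been truncated to the operand width, shifting it right by that width can only give zero.

```python
def umul_hi_buggy(a, b, bits):
    """High half taken from an already-truncated product: always zero."""
    mask = (1 << bits) - 1
    product = (a * b) & mask
    return product >> bits

def umul_hi_fixed(a, b, bits):
    """High half taken from the full double-width product."""
    mask = (1 << bits) - 1
    return ((a * b) >> bits) & mask

a = b = 0xFFFFFFFF
print(hex(umul_hi_buggy(a, b, 32)))
print(hex(umul_hi_fixed(a, b, 32)))
```

Since the unsigned-multiply overflow flag is derived from a non-zero high half, the buggy form would never report overflow.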

Test results

Recompilability round-trip: 191 pass / 38 xfail / 11 skip / 0 fail

Remaining 38 xfails:

  • 27 logic-op arg count mismatches (CC recovery drops unused 2nd arg — acceptable)
  • 11 eight-bit char signedness issues (VEX inlines 8-bit ops at register width)

Test plan

  • test_ccall_rewriters.py — Z3 semantic equivalence for all rewriter condition/op combos
  • test_ccop_triggers.py — decompile ccop_trigger binaries, verify no raw ccalls
  • test_recompilability.py — round-trip decompile → recompile → compare semantics
  • test_optimization_passes.py — unit tests for all 3 new simplifier passes
  • Existing decompiler test suite passes (210 pass, 2 pre-existing failures)
  • CI green (Lint, Typecheck, all test shards, Decompiler Snapshot Testing)

🤖 Generated with Claude Code

zardus and others added 13 commits February 25, 2026 08:36
Add handlers for previously unsupported condition code / operation
combinations in the AMD64, x86, and ARM ccall rewriters. These are
the VEX helper calls (amd64g_calculate_condition, etc.) that the
decompiler must rewrite into C-level comparisons.

New coverage includes:
- AMD64: CondB/CondNB for ADD/ADC, CondBE/CondNBE for SUB/ADD,
  CondL/CondNL/CondLE/CondNLE for SUB/ADD, CondS/CondNS for
  logic/inc/dec, CondO/CondNO for ADD/SMUL, rflags_c for SUB/DEC
- x86: Mirror of AMD64 additions adapted for 32-bit ops, plus
  _fix_size for sub-width operations
- ARM: CondMI/CondPL (sign flag), CondVS/CondVC (overflow),
  CondHI/CondLS (unsigned >/<= via carry+zero)

Also fixes 4 bugs found by semantic testing:
- AMD64 CondLE used wrong operand order for SUB
- AMD64 rflags_c for ADC had inverted carry computation
- ARM CondGE/CondLT had swapped signed comparison operators
- VEX ccall helpers had incorrect flag extraction bitmasks

Includes comprehensive unit tests with Z3 semantic equivalence
checking for all rewriter condition/operation combinations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the LLM refinement pass is enabled, the decompiler was creating
phantom unified variables that didn't correspond to any actual
variable in the function. This happened because the variable
unification step ran before LLM refinement but the variable list
wasn't updated after refinement removed some variables.

Fix by refreshing the unified variable set after LLM refinement
completes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new decompiler optimization passes that simplify VEX
flag-computation idioms into clean C equivalents:

1. OverflowBuiltinSimplifier: Rewrites paired overflow-check +
   conditional patterns (e.g., __OFADD__ followed by if-then) into
   GCC __builtin_add_overflow / __builtin_sub_overflow calls.

2. OverflowBuiltinPredicateSimplifier: Rewrites standalone overflow
   macro predicates (__OFADD__, __OFMUL__, etc.) that appear directly
   in conditions into __builtin_add_overflow_p / __builtin_mul_overflow_p.

3. CarryFlagSimplifier: Rewrites __CFADD__(a, b) != 0 patterns into
   the equivalent __builtin_add_overflow_p(a, b, (type)0), eliminating
   the IDA-style __CFADD__ macro from decompiled output.

All three passes are registered in both fast and full presets.
Disabling them via the decompiler's simplifier blacklist restores
the traditional IDA-style macro output for users who prefer it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a complete ARM64 (AArch64) ccall rewriter that translates VEX
arm64g_calculate_condition helper calls into C-level comparisons.

Covers all standard ARM64 condition codes:
- EQ/NE (zero flag)
- CS/CC (carry flag, unsigned >=/<)
- MI/PL (sign flag)
- VS/VC (overflow flag)
- HI/LS (unsigned >/<= via carry+zero)
- GE/LT/GT/LE (signed comparisons via sign^overflow)

Handles SUB, ADD, ADC, SBC, LOGIC, and shift operations.
Also fixes ARMHF ccall rewriter registration in __init__.py
(was missing the ARM rewriter import for 32-bit ARM targets).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
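The flag-to-comparison mapping listed in the ARM64 commit above can be checked by enumeration at 8 bits. The sketch below is plain Python with hypothetical names, not the rewriter itself. Note that ARM's C flag after a subtraction means "no borrow", the opposite of x86's CF.

```python
import operator

BITS = 8
MASK = (1 << BITS) - 1

def to_signed(x):
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x

def arm_sub_nzcv(a, b):
    """Model N, Z, C, V after an 8-bit compare/subtract with ARM semantics."""
    res = (a - b) & MASK
    n = bool(res >> (BITS - 1))
    z = res == 0
    c = a >= b  # ARM: C is set when there is NO borrow
    v = bool(((a ^ b) & (a ^ res)) >> (BITS - 1))
    return n, z, c, v

# name -> (flag formula, operands are signed?, equivalent comparison)
CONDITIONS = {
    "CS": (lambda n, z, c, v: c, False, operator.ge),
    "CC": (lambda n, z, c, v: not c, False, operator.lt),
    "HI": (lambda n, z, c, v: c and not z, False, operator.gt),
    "LS": (lambda n, z, c, v: (not c) or z, False, operator.le),
    "GE": (lambda n, z, c, v: n == v, True, operator.ge),
    "LT": (lambda n, z, c, v: n != v, True, operator.lt),
    "GT": (lambda n, z, c, v: (not z) and n == v, True, operator.gt),
    "LE": (lambda n, z, c, v: z or n != v, True, operator.le),
}

def check(name):
    """True if the flag formula matches the comparison for every 8-bit pair."""
    flag_fn, signed, compare = CONDITIONS[name]
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            x, y = (to_signed(a), to_signed(b)) if signed else (a, b)
            if bool(flag_fn(*arm_sub_nzcv(a, b))) != compare(x, y):
                return False
    return True

results = {name: check(name) for name in CONDITIONS}
print(results)
```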
Add a comprehensive test suite that compiles small C functions
designed to trigger specific condition code / operation combinations,
decompiles them, and verifies the output.

Includes:
- ccop_triggers source files (in angr-binaries repo) covering all
  AMD64/x86/ARM/ARM64 condition code operations: ADD, SUB, ADC,
  SBB, INC, DEC, logic ops, shifts, MUL, COPY, and rflags_c
- Parametrized pytest test that decompiles each function and checks
  for absence of raw ccall helpers in the output
- Z3-based semantic equivalence checking that verifies the decompiled
  C expression matches the VEX semantics for each condition
- ccop_report.py utility for generating coverage reports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a test that decompiles each ccop_trigger function, recompiles the
decompiled C output with GCC, links it against the original compiled
function, and compares outputs across a range of inputs to verify
semantic equivalence.

This catches issues that unit tests miss: signedness bugs, operator
precedence errors, missing truncation casts due to C integer promotion
rules, and type mismatches in builtin overflow calls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the C code generator renders CmpEQ(expr, 0) as !expr (the
cstyle_null_cmp shorthand), it was using the *binary* CmpEQ precedence
(9) to decide whether the LHS needs parentheses.  Since CmpEQ has
lower precedence than Add (12), Mul (13), etc., a compound LHS like
Add(a, b) would never get wrapped:

    CmpEQ(Add(a, b), 0) → !a + b      (WRONG: means (!a) + b)

In C, unary ! (precedence 14) binds tighter than ANY binary operator,
so the correct output is !(a + b).

Fix: when emitting the ! prefix for the CmpEQ==0 shorthand, always
force parentheses around a CBinaryOp LHS regardless of precedence.

Fixes 13 ccop_trigger functions that had inverted condition semantics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ccall rewriter's _fix_size() helper narrows 64-bit VEX temporaries
to the operation's actual width (8/16/32-bit) by emitting Convert nodes.
It unconditionally set is_signed=False on every Convert, which caused
all narrowed operands to be typed as unsigned in the C output.

For signed condition codes (CondL, CondLE, CondNL, CondNLE, CondS,
CondNS, CondO, CondNO), the comparison operands must be signed so that
C's ordered comparison operators (<, <=, >, >=) use signed semantics.
When both operands are unsigned, C's "usual arithmetic conversions"
produce an unsigned comparison, which gives wrong results for negative
values.

The fix has three parts:

1. _fix_size() now accepts a signed= parameter (default False for
   backwards compat) and threads it to the Convert node.  All call
   sites under signed conditions pass signed=True.

2. CBinaryOp gains a _cmp_signed slot that records the AIL comparison's
   intended signedness.  A new _cmp_signedness_cast() method emits
   explicit C casts like (int) or (long long) when the C operand type
   disagrees with the comparison signedness — this handles function
   parameters whose types come from the ABI and cannot be changed.

3. _propagate_cmp_signedness() on the code generator updates LOCAL
   variable and constant types to match the comparison signedness
   directly in the variable manager.  This changes the variable
   declaration (e.g., "unsigned long long v1" → "long long v1") so
   no cast is needed in the comparison itself.  Function parameters
   are explicitly skipped — _cmp_signedness_cast handles those.

This two-pronged approach (type propagation for locals, explicit casts
for params) produces clean output for normal code while ensuring
correctness for ccop functions where signedness matters.

Also fixes SimTypeInt128.c_repr() to emit "__int128" / "unsigned __int128"
instead of the non-standard "int128_t", and adds 128-bit to the
default_simtype_from_bits mapping.

Fixes 57 ccop_trigger functions that had unsigned args for signed
conditions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the C code generator renders CmpEQ(expr, 0) as !expr or
CmpNE(expr, 0) as just expr (the cstyle_null_cmp shorthand), the
result is wrong for 8-bit and 16-bit operations due to C's integer
promotion rules.

In C, arithmetic on char and short is implicitly promoted to int
before the operation.  So for 8-bit operands:

    unsigned char a = 0x80, b = 0x80;
    if (!(a + b))  // WRONG: int(0x80) + int(0x80) = 0x100, non-zero

The programmer's intent (and the AIL semantics) is to test the 8-bit
result, which wraps to 0.  The correct C is:

    if (!(unsigned char)(a + b))  // RIGHT: truncate to 8-bit first

Fix: when emitting the !expr or bare expr shorthand for CmpEQ/CmpNE
against zero, and the common type of the comparison is narrower than
32 bits (i.e., char or short), emit an explicit truncation cast around
the LHS expression.  This forces C to evaluate the narrow-width result
before the boolean test.

The cast type is derived from the comparison's common_type, so it
respects signedness (emitting "(char)" or "(unsigned char)" as
appropriate).

Fixes 25 ccop_trigger functions at 8-bit and 16-bit widths for
condz/condnz/conds/condns conditions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
__builtin_add_overflow_p(a, b, (type)0) uses the type of the third
argument to determine the width and signedness of the overflow check.
The decompiler was emitting bare 0 (which is int, i.e., signed 32-bit
in C), so:

- 8-bit overflow checks tested int overflow, not char overflow
- 16-bit overflow checks tested int overflow, not short overflow
- 64-bit overflow checks tested int overflow, not long long overflow
- Unsigned multiply (UMUL) overflow checks used signed semantics

The fix threads signedness information from the ccall rewriter through
to the C code generator via expression tags:

1. The ccall rewriter tags __OFADD__/__OFMUL__ calls with
   "overflow_signed": True/False based on whether the operation is
   ADD/SMUL (signed) or UMUL (unsigned).

2. The OverflowBuiltinPredicateSimplifier propagates this to the
   zero constant's tags as "overflow_p_signed".

3. MakeTypecastsImplicit.handle_CFunctionCall is rewritten to:
   - Skip prototype-based cast collapse for __builtin_*_overflow_p
     calls (which would strip the intentional type-conveying cast).
   - After processing, inject explicit casts on the third argument
     when its type differs from int: e.g., (unsigned char)0 for
     8-bit unsigned, (long long)0 for 64-bit signed.
   - Cast the first two operands when their signedness disagrees
     with the overflow check type (e.g., unsigned params need
     (int) casts for signed ADD overflow).

Fixes 20 ccop_trigger functions with overflow_p type mismatches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs in the amd64g_calculate_rflags_c rewriter:

**SUB carry flag (borrow):**
The existing SUB handler created a CmpLT node without explicit bits=
or tags, producing a comparison result at the operand width rather
than 1-bit.  When this was wider than ccall.bits, the "if cf.bits ==
ccall.bits: return cf" early exit skipped the Convert, and the
comparison node (which evaluates to 0 or 1) was used directly as a
full-width integer.  This caused the structurer to assign the wrong
branch polarity when threading the carry flag through if/else.

Fix: explicitly create a 1-bit CmpLT with tags, and always wrap in
Convert to ccall.bits.  This matches the pattern used by the ADD
carry handler.

**DEC carry flag:**
DEC does not modify the carry flag — it preserves CF from the previous
operation, stored in ndep (the "old flags" VEX operand).  The old code
extracted CF via (ndep & G_CC_MASK_C) >> G_CC_SHIFT_C, which produced
an opaque Shr(And(...)) tree that the simplifier couldn't reduce.

Fix: since CF is bit 0 of the x86-64 RFLAGS register, simplify to
(ndep & 1) which the C codegen renders cleanly.

**condition_processor.py __neg__ mapping:**
The claripy __neg__ operation (arithmetic negation, -x) was incorrectly
mapped to the AIL "Not" operator (logical/bitwise NOT, !x or ~x).
This corrupted conditions derived from negated carry flags.  Fix: map
__neg__ to "Neg" (arithmetic negation) instead.

Fixes 4 ccop_trigger functions: rflagsc_sub_{32,64}, rflagsc_dec_{32,64}.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The AMD64 ccall rewriter had no handlers for the SBB (subtract with
borrow) VEX operation, causing raw _ccall(calculate_condition, ...)
expressions to appear in the decompiled output.  These don't compile
as C.

SBB is used for extended-precision subtraction (e.g., 128-bit subtract
via two 64-bit operations).  VEX encodes it as:

    dep_1 = argL (left operand)
    dep_2 = argR ^ oldCF (right operand XORed with old carry)
    ndep  = old RFLAGS (carries the previous carry flag in bit 0)

The actual computation is: result = dep_1 - (arg2 + carry), where
arg2 = dep_2 ^ carry and carry = ndep & 1.

A new _sbb_prep() helper extracts the carry flag, recovers the
original arg2, and computes the result at the correct narrow width.
This is shared by all SBB condition handlers.

New handlers for SBB operations:

- CondZ/CondNZ: result == 0 / result != 0
- CondL: signed less-than, using extended precision (double-width
  sign-extended comparison to avoid overflow in the subtraction)
- CondS/CondNS: sign flag of the result (result < 0 / result >= 0)
- CondNBE: unsigned above (!CF && !ZF), computed as no-borrow AND
  result-nonzero using extended precision for the borrow check
- CondBE: unsigned below-or-equal (CF || ZF), borrow OR result-zero
- CondNB: unsigned above-or-equal (!CF), no borrow via extended
  precision comparison

Also removes incorrect signed=True from the rflags_c ADD and SUB
handlers — carry flag computation is always an unsigned comparison,
regardless of the condition context.

Fixes 29 ccop_trigger functions that had unrewritten SBB ccalls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update the recompilability test to reflect all bug fixes:
- Simplify _classify() to only xfail remaining 8-bit char signedness
  issues (down from 190 xfails to 38)
- Add 2-to-1 arg harness adapter for logic-op functions where the
  decompiler correctly optimizes away one argument
- Add decomp_nargs detection to count actual decompiled parameters
- Fix pylint warnings: add check=False to subprocess.run calls,
  encoding="utf-8" to open calls, narrow exception catching
- Fix pyright regressions: explicit None checks for _func_args and
  _variables_in_use in _propagate_cmp_signedness

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 25, 2026 19:47

Copilot AI left a comment


Pull request overview

This PR significantly enhances the angr decompiler's ability to translate VEX intermediate representation condition code calculations into clean C code. It adds ~60 new ccall condition rewriters for AMD64, x86, ARM, and ARM64 architectures, introduces 3 optimization passes for overflow handling, and fixes 8 code generation bugs discovered through recompilability testing.

Changes:

  • Added comprehensive ccall rewriters for all major architectures (AMD64, x86, ARM32, ARM64)
  • Implemented OverflowBuiltinSimplifier, OverflowBuiltinPredicateSimplifier, and CarryFlagSimplifier optimization passes
  • Fixed 8 bugs in C code generation including signedness handling, operator precedence, and SBB support
  • Added extensive test infrastructure with 3 new test files covering unit tests, integration tests, and round-trip recompilability

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated no comments.

| File | Description |
|------|-------------|
| test_recompilability.py | New 576-line round-trip test framework for verifying decompiled code can be recompiled |
| test_ccop_triggers.py | New 873-line integration test for ccall rewriter coverage |
| test_optimization_passes.py | Added 743 lines of unit tests for new optimization passes |
| test_decompiler_llm.py | Fixed variable collection to only use codegen-visible variables |
| test_semvar_naming.py | Updated regex to handle signedness casts in loop bounds |
| test_decompiler.py | Updated test expectation for CFADD rewrite |
| overflow_builtin_simplifier.py | New 377-line pass for OFADD/OFMUL pattern matching |
| overflow_builtin_p_simplifier.py | New 159-line pass for standalone overflow predicates |
| carry_flag_simplifier.py | New 141-line pass for CFADD rewriting |
| c.py | Enhanced CBinaryOp with signedness tracking and narrow cast absorption (195 new lines) |
| amd64_ccalls.py | Expanded from ~400 to ~1230 lines with comprehensive condition support |
| x86_ccalls.py | Expanded from ~300 to ~920 lines mirroring AMD64 additions |
| arm_ccalls.py | Added 170 lines of new condition handlers |
| arm64_ccalls.py | New 496-line complete ARM64 rewriter |
| sim_type.py | Fixed SimTypeInt128 to use __int128 instead of int128_t |
| ccall.py | Fixed VEX semantics bugs in UMUL/SMUL and ARM64 SBC |
| condition_processor.py | Fixed neg mapping from "Not" to "Neg" |
| decompiler.py | Updated LLM functions to only process codegen-visible variables |


@angr-bot
Member

Corpus decompilation diffs can be found at angr/dec-snapshots@master...angr/angr_6182

@zardus
Member Author

zardus commented Feb 25, 2026

Corpus Decompilation Regression Analysis

Diff: angr/dec-snapshots@master...angr/angr_6182
Result: 0 regressions across 40 changed files.

Change breakdown

| Category | Files | Description |
|----------|-------|-------------|
| Bug 3 (signedness) | ~30 | unsigned int → int declarations; (int) casts on signed comparisons |
| Bug 4 (narrow truncation) | ~20 | !(char)(v0 & 1), (short)(flag+1) < (short)flag, (unsigned short)(v2 & 4095) |
| Bug 8 (narrowing absorption) | ~6 | unsigned int → int declaration narrowing (overlaps with Bug 3) |
| Type name fix | 2 | uint128_t → unsigned __int128 |
| Struct reorder | 1 | Non-deterministic struct_1/struct_2 ordering |

Bug 8 (new in this PR vs #6180) impact

Bug 8 (_try_narrow_cmp_operand) absorbs narrowing casts into variable declarations. In the corpus, its effect is conservative: all 6 affected files show unsigned int → int narrowing, which heavily overlaps with Bug 3's signedness fix. No aggressive narrowings (e.g., int → char in declarations) were observed.

Notably, Music_Store_Client/402530 (cgc_purchaseSong) still shows (char)ptr->field_10 < (char)v3 with v3 remaining unsigned int — Bug 8 didn't fire here, likely because the if-condition's comparison path doesn't go through the standard _handle_Expr_BinaryOp ordered-comparison flow. This is a missed optimization, not a regression.

Flagged for scrutiny

BudgIT/402de0 (cgc_checkheap): Shows struct_2*/struct_0* → unsigned int type degradation with loss of struct field access syntax (cur->field_10 → *((cur + 16))). This is not caused by this PR; it is a pre-existing type recovery instability triggered by non-deterministic struct ordering (visible in BudgIT/4034c0). The only PR-attributable change is the Bug 3 (int)cgc_sendline(...) cast, which is correct.

Comparison with PR #6180 baseline

PR #6180's corpus analysis found 42 files with the same Bug 3/4/type-name/struct-reorder patterns and 0 regressions. This PR (#6182) adds Bug 8 which manifests in ~6 files, all conservatively. The 2-file difference in count (40 vs 42) is within non-deterministic noise (struct reordering can cascade into different file sets).

🤖 Generated with Claude Code

@codecov

codecov Bot commented Feb 25, 2026

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|-----------------|--------|--------|---------|
| 1645 | 1 | 1644 | 52 |
View the full list of 1 ❄️ flaky test(s)
tests/procedures/libc/test_string.py::TestStringSimProcedures::test_strlen

Flake rate in main: 12.50% (Passed 7 times, Failed 1 times)

Stack Traces | 0.001s run time
worker 'gw0' crashed while running '.../procedures/libc/test_string.py::TestStringSimProcedures::test_strlen'


Two improvements to comparison rendering in the C code generator:

1. Absorb narrowing CTypeCast nodes into variable declarations. When the
   ccall rewriter produces narrow-width comparisons like
   CmpLTs(Convert(32->8, dep1), Convert(32->8, dep2)), the codegen
   rendered them as (char)v3 < (char)v4. New _try_narrow_cmp_operand()
   propagates the narrow type to the variable declaration (so
   unsigned int v3 becomes char v3) and strips the cast. Function
   parameters and struct field accesses are left alone.

2. Suppress no-op signedness casts on leaf operands. _cmp_signedness_cast
   emits explicit casts when operand signedness disagrees with the
   comparison, but SimTypeChar(signed=True) and SimTypeChar(signed=False)
   both render as "char" in C, making (char)x a no-op when x is already
   char. Only suppress when the operand is a simple leaf (CVariable,
   CVariableField, CConstant) not subject to C integer promotion; compound
   expressions like (char)(a0 + a1) keep the cast since it is a real
   truncation from int-promoted width.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zardus zardus force-pushed the decompiler-ccop-handlers branch from da85c19 to 8d6de59 Compare February 25, 2026 21:28
@zardus
Member Author

zardus commented Feb 25, 2026

Reopening as new PR to retrigger corpus analysis after squashing Bug 8+9 commits.

3 participants