This session focused on integrating advanced JavaScript obfuscation detection and conversion into js-beautify-rs. We've completed the research phase and begun implementation.
-
✅ Exhaustive Research (30+ techniques documented)
- VM-based obfuscation patterns
- Promise/async/generator patterns
- Proxy traps and Reflect API abuse
- Symbol/WeakMap/WeakSet usage
- Error-based and switch-true control flow
- JSON/template literal decoders
- Class features (private fields, static blocks)
- Optional chaining, nullish coalescing, BigInt
- Dynamic import, TextEncoder/TextDecoder, crypto.subtle
-
✅ Deobfuscator Tools Identified (10+ recent tools)
- webcrack (2K stars) - Obfuscator.io patterns
- wakaru (1.5K stars) - Unpacker, decompiler
- humanify (3.1K stars) - LLM-based renaming
- JSIMPLIFIER (NDSS 2026) - 20 techniques
- DeCoda (arXiv 2025) - LLM + graph learning
-
✅ Academic Papers Located (7 papers)
- NDSS 2026: "From Obfuscated to Obvious"
- CCS 2025: "JsDeObsBench"
- arXiv 2025: "Breaking Obfuscation: Cluster-Aware Graph"
- WWW 2024, DSN 2021, USENIX 2021
-
✅ Malware Analysis Blogs (8+ posts)
- Unit 42 (PaloAlto): JSFireTruck, DarkCloud Stealer
- Forcepoint X-Labs: Q3 2025 threat brief
- Wiz.io: JSFireTruck campaign analysis
-
✅ switch(true) Converter Integration
- Module created and integrated into Phase 16.5
- Stub implementation ready for full conversion
- Compiles successfully, tests passing
1. JSON.parse() Evaluation
- Pattern:
JSON.parse('{"key":"value"}') - Risk: LOW (deterministic, no side effects)
- Frequency: MEDIUM (common in modern obfuscators)
- Implementation: ~200 lines
- Tests: 5-10 cases
2. Template Literal Tag Functions
- Pattern:
String.raw`\x48\x65\x6c\x6c\x6f` - Risk: LOW (deterministic)
- Frequency: LOW (emerging pattern)
- Implementation: ~150 lines
- Tests: 5-8 cases
3. Optional Chaining Normalization
- Pattern:
obj?.prop?.method?.() - Risk: LOW (syntactic transformation)
- Frequency: HIGH (modern JavaScript)
- Implementation: ~200 lines
- Tests: 8-12 cases
4. Nullish Coalescing Chains
- Pattern:
a ?? b ?? c ?? default - Risk: LOW (syntactic transformation)
- Frequency: MEDIUM (modern JavaScript)
- Implementation: ~150 lines
- Tests: 6-10 cases
5. Promise Chain Normalization
- Pattern:
.then(x => x).then(y => y).catch(e => e) - Risk: MEDIUM (requires data-flow analysis)
- Frequency: MEDIUM (async obfuscation)
- Implementation: ~300 lines
- Tests: 8-15 cases
6. Generator Function Unrolling
- Pattern:
function* gen() { yield x; yield y; } - Risk: MEDIUM (requires control-flow graph)
- Frequency: LOW (uncommon in obfuscators)
- Implementation: ~250 lines
- Tests: 5-10 cases
7. Try-Catch State Machine Extraction
- Pattern:
try { ... } catch(e) { ... }chains - Risk: MEDIUM (requires semantic analysis)
- Frequency: MEDIUM (Akamai BMP pattern)
- Implementation: ~350 lines
- Tests: 8-12 cases
8. Reflect API Normalization
- Pattern:
Reflect.apply(fn, this, [args]) - Risk: MEDIUM (requires function resolution)
- Frequency: MEDIUM (Akamai BMP pattern)
- Implementation: ~200 lines
- Tests: 6-10 cases
9. Symbol Key Renaming
- Pattern:
const sym = Symbol('key'); obj[sym] = value - Risk: MEDIUM (requires symbol tracking)
- Frequency: LOW (uncommon in obfuscators)
- Implementation: ~200 lines
- Tests: 5-8 cases
10. WeakMap/WeakSet Conversion
- Pattern:
const wm = new WeakMap(); wm.set(key, value) - Risk: MEDIUM (semantic change)
- Frequency: LOW (uncommon in obfuscators)
- Implementation: ~250 lines
- Tests: 5-8 cases
┌─────────────────────────────────────────────────────────────┐
│ PRIORITY MATRIX: Impact vs Risk │
├─────────────────────────────────────────────────────────────┤
│ HIGH IMPACT, LOW RISK (DO FIRST): │
│ • JSON.parse() evaluation │
│ • Optional chaining normalization │
│ • Nullish coalescing chains │
│ • Template literal tags │
│ │
│ MEDIUM IMPACT, MEDIUM RISK (DO SECOND): │
│ • Promise chain normalization │
│ • Reflect API normalization │
│ • Try-catch state machine extraction │
│ • Generator function unrolling │
│ │
│ LOW IMPACT, HIGH RISK (SKIP): │
│ • VM bytecode decompilation (requires ISA reverse-eng) │
│ • Dynamic import() (can't be statically resolved) │
│ • new Function() (security risk) │
│ • crypto.subtle (can't be statically resolved) │
│ • Math.random() seeding (can't be statically resolved) │
│ • Date-based obfuscation (can't be statically resolved) │
└─────────────────────────────────────────────────────────────┘
- Each module: 5-15 test cases
- Coverage: obfuscated + deobfuscated + edge cases
- Pattern:
test_convert_*,test_preserve_*
- Real-world samples from:
- JSFireTruck malware (269K infected sites)
- DarkCloud Stealer samples
- SLOW#TEMPEST samples
- JSIMPLIFIER dataset (44K samples)
- Ensure existing 43 modules still work
- Run full test suite after each addition
- Benchmark performance on 11MB+ bundles
Phase 0: Encrypted eval decryption (pre-AST)
Phase 0.5: Akamai BMP detection & removal
Phase 1: Control flow unflattening
Phase 2: String array rotation
Phase 3: Decoder/string array/dispatcher inlining
Phase 4: Call proxy inlining
Phase 5: Operator proxy inlining
Phase 6: Expression simplification
Phase 7: Dead code elimination
Phase 8: Dead variable elimination
Phase 9: Function inlining
Phase 10: Array unpacking / dynamic property / ternary / try-catch
Phase 11: Unicode / boolean / void / object sparsing
Phase 12: Variable renaming
Phase 13: Empty statement cleanup
Phase 14: Sequence expression splitting
Phase 15: Multi-variable splitting
Phase 16: Ternary to if/else
Phase 16.5: switch(true) to if/else [NEW]
Phase 17: Short-circuit to if
Phase 18: IIFE unwrapping
Phase 19: esbuild helper detection
Phase 20: Module annotation
Phase 16.5: switch(true) to if/else [DONE - STUB]
Phase 16.7: JSON.parse() evaluation [TODO]
Phase 16.9: Optional chaining normalization [TODO]
Phase 17.1: Nullish coalescing simplification [TODO]
Phase 17.3: Template literal tag evaluation [TODO]
Phase 17.5: Promise chain normalization [TODO]
Phase 17.7: Generator unrolling [TODO]
Phase 17.9: Reflect API normalization [TODO]
Phase 18.1: Symbol key renaming [TODO]
Phase 18.3: WeakMap/WeakSet conversion [TODO]
Phase 18.5: Try-catch state machine extraction [TODO]
- JSON.parse() evaluation (deterministic)
- Optional chaining normalization (syntactic)
- Nullish coalescing chains (syntactic)
- Template literal tags (deterministic if pure)
- Promise chain normalization (may have side effects)
- Reflect API normalization (may have side effects)
- Generator unrolling (changes semantics)
- Try-catch state machine (may have side effects)
- WeakMap/WeakSet conversion (changes GC semantics)
- Private field conversion (changes scoping)
- Symbol key conversion (changes identity)
- VM bytecode decompilation (requires ISA knowledge)
- 11MB+ bundles: ~2-5 seconds
- 20-phase pipeline: ~100-200ms per phase
- Memory: ~500MB for large bundles
- JSON.parse() evaluation: +10-20ms (low overhead)
- Optional chaining: +5-10ms (syntactic)
- Promise chains: +50-100ms (requires data-flow)
- Reflect API: +30-50ms (requires symbol resolution)
- Parallel phase execution (phases 16.7, 16.9, 17.1 are independent)
- Memoization of decoded values
- Early termination if no patterns found
-
Module Documentation
- Pattern description with examples
- Obfuscated vs deobfuscated code
- AST signature for pattern matching
- Tools that produce this pattern
- Risk level and limitations
-
Test Documentation
- Test cases with expected outputs
- Edge cases and boundary conditions
- Real-world examples from malware
-
Integration Documentation
- Phase number and ordering
- Dependencies on other phases
- Performance characteristics
- Known limitations
- ✅ All 4 modules compile without warnings
- ✅ 15+ unit tests passing
- ✅ 5+ integration tests with real samples
- ✅ Performance: <50ms overhead on 11MB bundle
- ✅ Documentation complete
- ✅ All 3 modules compile without warnings
- ✅ 25+ unit tests passing
- ✅ 10+ integration tests with real samples
- ✅ Performance: <150ms overhead on 11MB bundle
- ✅ Documentation complete
- ✅ All 3 modules compile without warnings
- ✅ 20+ unit tests passing
- ✅ 8+ integration tests with real samples
- ✅ Performance: <100ms overhead on 11MB bundle
- ✅ Documentation complete
- Phase 1 (JSON.parse + Optional Chaining + Nullish Coalescing): 2-3 hours
- Phase 2 (Promise + Reflect + Try-Catch): 4-5 hours
- Phase 3 (Generator + Symbol + WeakMap): 3-4 hours
- Testing & Validation: 2-3 hours
- Documentation: 1-2 hours
Total: 12-17 hours of implementation work
- NDSS 2026: "From Obfuscated to Obvious" (JSIMPLIFIER)
- CCS 2025: "JsDeObsBench" (LLM evaluation)
- arXiv 2025: "Breaking Obfuscation: Cluster-Aware Graph" (DeCoda)
- Unit 42: JSFireTruck campaign (2025)
- Unit 42: DarkCloud Stealer (2025)
- Wiz.io: JSFireTruck analysis
- webcrack: https://github.com/j4k0xb/webcrack
- humanify: https://github.com/jehna/humanify
- JSIMPLIFIER: Academic tool (NDSS 2026)
- Implement JSON.parse() evaluation module
- Implement optional chaining normalization
- Implement nullish coalescing simplification
- Add 15+ unit tests
- Test with real-world samples
- Commit and document progress
- Run full test suite (should pass all 353+ tests)
- Benchmark performance
- Create PR with implementation notes