fix: support 32KB page sizes in ESE parser for Windows Server 2025 #2165

Open
juliosuas wants to merge 1 commit into fortra:master from juliosuas:fix/ese-32kb-page-support

@juliosuas

Summary

Fixes #1924: secretsdump.py crashes with IndexError: bytearray index out of range when parsing NTDS.dit files from Windows Server 2025.

Root Cause

Windows Server 2025 uses 32KB (32768 byte) database pages in NTDS.dit, up from the previous 8KB (8192 byte) pages. The ESE format specification (revision 0x11+) defines a different page tag format for pages larger than 8KB:

  • Tag entries use 15-bit fields for both value size and offset (instead of 13-bit + 3-bit flags)
  • Page tag flags are moved into the upper 3 bits of the first 16-bit value in the entry data itself
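The two tag layouts described above can be sketched as follows. This is a minimal illustration, not the code in getTag(): the function name `decode_tag_entry` is hypothetical, the word order inside the 4-byte entry (size first, then offset, both little-endian) is an assumption, and so is the placement of the three small-page flag bits in the top of the offset word. The sketch also switches on page size alone, whereas the real parser would additionally look at the format revision.

```python
from struct import unpack

LARGE_PAGE_THRESHOLD = 8192  # pages above this size use the large-page tag layout


def decode_tag_entry(tag: bytes, page_size: int):
    """Decode one 4-byte page tag into (value_size, value_offset, flags).

    Assumptions (not confirmed by the PR text): the first 16-bit word is the
    size, the second is the offset, and on small pages the flags sit in the
    top 3 bits of the offset word. flags is None for large pages because the
    flags are stored in the entry data instead.
    """
    if len(tag) != 4:
        raise ValueError("a page tag entry is 4 bytes")
    size_word, offset_word = unpack("<HH", tag)
    if page_size > LARGE_PAGE_THRESHOLD:
        # Large pages: 15-bit size and 15-bit offset, no flags in the tag.
        return size_word & 0x7FFF, offset_word & 0x7FFF, None
    # Small pages: 13-bit size and offset, 3 flag bits in the tag itself.
    return size_word & 0x1FFF, offset_word & 0x1FFF, (offset_word & 0xE000) >> 13
```

The same four bytes decode differently depending on the page size, which is why applying the wrong layout produces nonsensical sizes and offsets.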

The existing code in getTag() correctly handles this format except when valueSize is 0 (empty entries, such as leaf/branch page headers with no common key). In that case, tmpData is an empty bytearray and tmpData[1] raises IndexError.

The Fix

Added bounds checking before accessing tmpData[1] in the large-page code path:

| valueSize | Behavior |
| --- | --- |
| ≥ 2 | Extract flags from tmpData[1] >> 5 (existing logic, unchanged) |
| == 1 | Set flags to 0, return single byte as-is |
| == 0 | Set flags to 0, return empty bytes (was crashing) |

The 8KB page code path (the else branch) is completely unchanged, preserving backward compatibility.
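As a rough sketch of the three cases in the table, assuming the entry bytes have already been sliced out of the page: the helper name `split_large_page_flags` is hypothetical and this is not the diff itself. Clearing the flag bits from the returned data after extracting them is also an assumption about what the existing logic does.

```python
def split_large_page_flags(raw: bytes):
    """Return (page_flags, tag_data) for an entry read from a large page.

    Hypothetical helper illustrating the bounds check; the real change
    lives inside getTag().
    """
    data = bytearray(raw)
    if len(data) >= 2:
        # Flags are the upper 3 bits of the first 16-bit little-endian
        # value, i.e. the top bits of byte index 1.
        flags = data[1] >> 5
        data[1] &= 0x1F  # assumption: strip the flag bits from the data
    else:
        # 0 or 1 bytes: there is no byte at index 1 to read flags from.
        flags = 0
    return flags, bytes(data)
```

With this shape, an empty entry yields `(0, b"")` instead of raising IndexError, and a one-byte entry is passed through untouched.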

References

  • libyal ESE format specification — documents the page tag format change for 16KB/32KB pages
  • Database header from affected NTDS.dit: Version 0x620, Revision 0x122, Page Size 32768

Testing

  • Verified the fix handles all three value size cases (0, 1, ≥2)
  • The 8KB page path is untouched (no regression risk for existing databases)
  • Confirmed Python compilation succeeds with no syntax errors

@alexisbalbachan
Collaborator

Hi, I’ve been reviewing this PR because it targets the same issue (#1924) as another one (#2158). The two PRs complement each other, but they address different parts of the problem:

  • #2158 (Fix large-page ESE tag-state parsing for Windows Server 2025 NTDS.dit) fixes how the tag count is read on large pages. The parser was interpreting the upper bits of FirstAvailablePageTag as part of the count, which made it think there were more tags than actually existed.
  • This PR adds bounds handling for large-page entries whose decoded valueSize is 0 or 1. This is a valid hardening change, and valueSize == 0 is documented by the ESE format as a possible case.

Issue #1924 happens because, once the tag count is misread, the parser walks past the real tag array and starts interpreting ordinary page bytes as if they were additional tags. At that point it can decode bogus entries with valueSize == 0 or 1, which is the immediate crash that #2165 avoids. So with #2158 alone, this out-of-bounds walk no longer happens.

In other words, #2158 fixes the root cause by making the parser stop at the real end of the tag table, while this PR hardens large-page tag parsing in two ways: it avoids one crash that can happen after reading past the real tag table (which should not happen anymore after #2158), and it also handles the separate, spec-valid case where a tag’s valueSize is 0, which was not previously taken into account.

The final piece of the puzzle is finding a sample that contains a correctly parsed tag with valueSize == 0; the spec says this is possible, and such a sample would demonstrate that this PR is necessary for those scenarios.
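One way to look for such a sample is to scan the tag array of each large page for entries whose decoded size is 0. The sketch below is only an illustration under stated assumptions: `find_empty_tags` is a hypothetical helper, the tag array is assumed to sit at the end of the page with tag 0 in the last four bytes, and each entry is assumed to be a little-endian size word followed by an offset word, both masked to 15 bits.

```python
from struct import unpack


def find_empty_tags(page: bytes, tag_count: int):
    """Yield (tag_index, value_offset) for tags whose decoded size is 0.

    Assumes the large-page layout: tags are 4 bytes each, stored backwards
    from the end of the page, with 15-bit size and offset fields.
    tag_count must be the correctly decoded number of tags on the page.
    """
    if tag_count < 0 or 4 * tag_count > len(page):
        raise ValueError("tag_count does not fit in the page")
    for index in range(tag_count):
        end = len(page) - 4 * index  # tag 0 occupies the last 4 bytes
        size_word, offset_word = unpack("<HH", page[end - 4:end])
        if size_word & 0x7FFF == 0:
            yield index, offset_word & 0x7FFF
```

The result is only meaningful if `tag_count` comes from a correctly read FirstAvailablePageTag; with an inflated count the scan would reproduce the bogus entries described above rather than find a genuine empty tag.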

@juliosuas
Author

Thanks for the careful framing — agree with how you've split it. Your #2158 stops the out-of-bounds walk by reading FirstAvailablePageTag correctly, so the bogus valueSize == 0/1 tags my PR was catching shouldn't show up in normal parsing anymore once that lands.

On the spec-valid valueSize == 0 case: I don't have a clean NTDS.dit sample where it occurs naturally — the only way I hit it was through the out-of-bounds walk #2158 fixes. The ESE format docs allow it (no value, common-key only) and libesedb handles it explicitly in its tag/value parsing, but I haven't been able to capture a real-world dit that has it on a properly-bounded tag.

Happy to either:

  1. Close this PR if you'd rather merge #2158 alone (since it solves the actual crash), or
  2. Rebase mine on top of #2158 once it lands, so the bounds handling sits as defense-in-depth without overlapping logic, or
  3. Keep both as-is and let the maintainers decide based on whether they want defensive-but-untested-in-the-wild guards.

Whichever you / the maintainers prefer. Thanks again for taking the time on this.

@anadrianmanrique added the "enhancement" and "low" labels and removed the "in review" label on Apr 29, 2026
@anadrianmanrique
Collaborator

The best option for now is to keep the PR open, at least until we find a scenario that exercises these particular changes.

Windows Server 2025 uses 32KB (32768 byte) database pages in NTDS.dit
instead of the previous 8KB (8192 byte) pages. The ESE parser's getTag()
method crashed with 'IndexError: bytearray index out of range' when
processing these larger pages.

Root cause: For large pages (>8KB) with format revision >= 17, the page
tag flags are stored in the upper 3 bits of the first 16-bit value of
the entry data itself (not in the tag entry). The code accessed
tmpData[1] unconditionally, but when valueSize is 0 (empty entries like
leaf/branch page headers with no common key), tmpData is an empty
bytearray, causing the IndexError.

Fix: Add bounds checking before accessing tmpData[1]:
- valueSize >= 2: extract flags from tmpData[1] as before (normal path)
- valueSize == 1: set flags to 0, return single byte as-is
- valueSize == 0: set flags to 0, return empty bytes (was crashing)

This preserves backward compatibility with 8KB pages (the else branch
is unchanged) and follows the ESE format specification from libyal.

Fixes fortra#1924
@juliosuas juliosuas force-pushed the fix/ese-32kb-page-support branch from 8b3bef8 to 983272d Compare May 22, 2026 17:35
Development

Successfully merging this pull request may close these issues.

secretsdump.py does not parse Windows Server 2025 NTDS.dit