Fix large-page ESE tag-state parsing for Windows Server 2025 NTDS.dit (issue #1924) by alexisbalbachan · Pull Request #2158 · fortra/impacket

alexisbalbachan · 2026-03-27T08:36:29Z

This PR fixes the ESE page-header parsing bug behind issue #1924, where secretsdump.py failed to parse Windows Server 2025 NTDS.dit files using 32 KB ESE pages.

Root cause:

FirstAvailablePageTag was treated as a plain 16-bit tag count.
On large-page databases, the low 12 bits contain the tag count, while the upper 4 bits appear to encode reserved tag state.
As a result, a value such as 0x100c was interpreted as a tag count of 4108 instead of 12, causing Impacket to walk past the last valid tag and crash with IndexError.

Fix:

FirstAvailablePageTag is now split into two values:
- The upper 4 bits are stored as tagReserved. There is little documentation for this field; dissect.esedb treats it as a count of reserved tags, which it then uses to calculate the actual logical node count. I did not implement that part because I could not produce a dump with tagReserved > 1.
- The remaining 12 bits are used as the tag count (instead of the full 16 bits used previously).
These changes are in line with what dissect.esedb does.
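
The split can be sketched as follows. The constant names match those visible in the diff under review, but their values (0x0FFF and 12) are inferred here from the "low 12 bits / upper 4 bits" description rather than copied from the patch, and `split_first_available_page_tag` is a hypothetical helper, not an Impacket function.

```python
# Values inferred from the description: low 12 bits = count, upper 4 bits = reserved state.
FIRST_AVAILABLE_PAGE_TAG_MASK = 0x0FFF
FIRST_AVAILABLE_PAGE_TAG_RESERVED_SHIFT = 12


def split_first_available_page_tag(raw):
    """Return (tag_reserved, tag_count) for a 16-bit large-page header field."""
    # A zero in the upper bits falls back to 1 (only tag 0 reserved), as in the diff.
    tag_reserved = (raw >> FIRST_AVAILABLE_PAGE_TAG_RESERVED_SHIFT) or 1
    tag_count = raw & FIRST_AVAILABLE_PAGE_TAG_MASK
    return tag_reserved, tag_count


print(split_first_available_page_tag(0x100C))  # (1, 12)
```

Read as a plain 16-bit integer, 0x100C is 4096 + 12 = 4108, which is the bogus count that triggered the IndexError.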

Additional Fix:

secretsdump and raiseChild assumed that the USER_PROPERTIES structure always contains PropertyCount and UserProperties. Those fields are optional, and I obtained a dump in which a user had a valid zero-property supplementalCredentials blob, where SAMR omits PropertyCount. Because PropertyCount was modeled as unconditional, the parser consumed later bytes as the count and then failed while decoding a non-existent USER_PROPERTY, producing an error while processing that user.
This was fixed by parsing only the fixed USER_PROPERTIES header in samr.py and handling the optional tail manually based on Length. When the blob is in the zero-property form, we now return PropertyCount = 0 and empty property data; otherwise we parse PropertyCount, the property buffer, and Reserved5 explicitly. secretsdump.py and raiseChild.py were updated to use that helper, and both now safely discard malformed trailing property data instead of failing the whole row.
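
A rough sketch of the optional-tail handling follows. The field sizes reflect my reading of the MS-SAMR USER_PROPERTIES layout and should be checked against the spec, and `parse_user_properties` is a hypothetical name, not the helper added to samr.py. The PR keys the decision off Length; this sketch uses the number of bytes left after the fixed header instead, purely to stay short.

```python
import struct

# Assumed fixed-header layout (verify against MS-SAMR before relying on it):
# Reserved1 (4), Length (4), Reserved2 (2), Reserved3 (2), Reserved4 (96), PropertySignature (2)
FIXED_HEADER = struct.Struct('<IIHH96sH')  # 110 bytes


def parse_user_properties(blob):
    """Return (property_count, property_data) from a supplementalCredentials blob."""
    if len(blob) < FIXED_HEADER.size:
        raise ValueError('blob shorter than the fixed USER_PROPERTIES header')
    tail = blob[FIXED_HEADER.size:]
    # Zero-property form: PropertyCount is omitted, so at most Reserved5 remains.
    if len(tail) < 3:
        return 0, b''
    (property_count,) = struct.unpack_from('<H', tail, 0)
    # Everything between PropertyCount and the trailing Reserved5 byte.
    property_data = tail[2:-1]
    return property_count, property_data
```

A caller would then decode `property_count` USER_PROPERTY entries from `property_data` and stop cleanly if the buffer runs out early.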

Copilot

Pull request overview

Fixes ESE page-header/tag-state parsing to support Windows Server 2025 NTDS.dit files using 32 KiB pages, preventing secretsdump.py from walking past the tag array and crashing.

Changes:

Introduce masking/splitting of FirstAvailablePageTag to derive an effective tag count on large pages.
Store derived tagCount/tagReserved on ESENT_PAGE.
Update tag iteration/slicing across page parsing code paths to use tagCount instead of the raw header field.


Copilot · 2026-03-27T08:41:12Z

+                    # Logical node counts should be derived from the effective reserved-tag count
+                    # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.
+                self.tagReserved = (self.record['FirstAvailablePageTag'] >> FIRST_AVAILABLE_PAGE_TAG_RESERVED_SHIFT) or 1
+                self.tagCount = self.record['FirstAvailablePageTag'] & FIRST_AVAILABLE_PAGE_TAG_MASK


tagReserved is parsed for large pages but is never used when iterating tags (most loops still start at 1). If tagReserved can be > 1, the parser may still try to interpret reserved tags as records; consider iterating from tagReserved (or otherwise skipping reserved tags) when walking leaf/branch tags.

Suggested change

-                    # Logical node counts should be derived from the effective reserved-tag count
-                    # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.
-                self.tagReserved = (self.record['FirstAvailablePageTag'] >> FIRST_AVAILABLE_PAGE_TAG_RESERVED_SHIFT) or 1
-                self.tagCount = self.record['FirstAvailablePageTag'] & FIRST_AVAILABLE_PAGE_TAG_MASK
+                    # Logical node counts should be derived from the effective reserved-tag count
+                    # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.
+                raw_tag_field = self.record['FirstAvailablePageTag']
+                self.tagReserved = (raw_tag_field >> FIRST_AVAILABLE_PAGE_TAG_RESERVED_SHIFT) or 1
+                physicalTagCount = raw_tag_field & FIRST_AVAILABLE_PAGE_TAG_MASK
+                # On large pages, adjust tagCount so it represents the logical node count (excluding all reserved tags).
+                # When tagReserved == 1 (the legacy assumption), this reduces to the original behavior.
+                self.tagCount = physicalTagCount - (self.tagReserved - 1)

alexisbalbachan · 2026-03-27

Intentional. This PR fixes the confirmed #1924 crash by correcting the large-page tag count parsing. tagReserved is modeled for parity with dissect.esedb, but I am not changing logical-node traversal without a sample exposing effective tagReserved > 1, since that would require a broader change than this bug fix.
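
For context, the traversal change discussed in this thread, which the PR deliberately leaves out, might look roughly like the sketch below. It assumes the dissect.esedb interpretation mentioned in the description (the first tagReserved tags are not logical nodes). `logical_tag_indices` is a hypothetical name, and the behavior is unverified against any real page with tagReserved > 1.

```python
def logical_tag_indices(tag_count, tag_reserved=1):
    """Indices of tags that would hold logical nodes, skipping reserved leading tags."""
    # With tag_reserved == 1 this matches the legacy loops that start at tag 1.
    return range(tag_reserved, tag_count)


print(list(logical_tag_indices(5)))     # [1, 2, 3, 4]
print(list(logical_tag_indices(5, 2)))  # [2, 3, 4]
```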

Copilot · 2026-03-27T08:41:13Z

-        for i in range(self.record['FirstAvailablePageTag']):
+        for i in range(self.tagCount):
            tag = tags[-4:]
            if self.__DBHeader['Version'] == 0x620 and self.__DBHeader['FileFormatRevision'] > 11 and self.__DBHeader['PageSize'] > 8192:


In dump(), the large-page tag decoding check uses FileFormatRevision > 11, but elsewhere the Windows 7+ boundary is treated as >= 0x11 / >= 17 (see getTag() and ESENT_PAGE_HEADER). To keep behavior consistent and avoid applying the large-page decoding to revisions 0x0c–0x10, update this condition to match the same threshold used elsewhere.

Suggested change

-            if self.__DBHeader['Version'] == 0x620 and self.__DBHeader['FileFormatRevision'] > 11 and self.__DBHeader['PageSize'] > 8192:
+            if self.__DBHeader['Version'] == 0x620 and self.__DBHeader['FileFormatRevision'] >= 0x11 and self.__DBHeader['PageSize'] > 8192:

alexisbalbachan · 2026-03-27

I agree the condition is inconsistent with the rest of the module, but it is outside the crash path fixed here and I do not have a sample showing that revisions 0x0c..0x10 are mishandled by dump(). I’d prefer to keep this PR scoped to the reproducible
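
The mismatch is easy to miss because `11` is decimal while `0x11` is 17. A small sketch that enumerates the revisions on which the two conditions disagree (the function names are illustrative only and do not exist in Impacket):

```python
def dump_condition(revision):
    # Threshold currently used in dump(): decimal 11.
    return revision > 11


def get_tag_condition(revision):
    # Threshold used elsewhere in the module: 0x11 == 17.
    return revision >= 0x11


disagree = [r for r in range(0x20) if dump_condition(r) != get_tag_condition(r)]
print([hex(r) for r in disagree])  # ['0xc', '0xd', '0xe', '0xf', '0x10']
```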

Copilot · 2026-03-27T08:41:13Z

+            self.tagCount = self.record['FirstAvailablePageTag']
+            if self.__DBHeader['FileFormatRevision'] >= 0x11 and self.__DBHeader['PageSize'] > 8192:
+                # TODO: The upper 4 bits may encode how many leading tags are reserved on large pages.
+                    # Logical node counts should be derived from the effective reserved-tag count
+                    # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.
+                self.tagReserved = (self.record['FirstAvailablePageTag'] >> FIRST_AVAILABLE_PAGE_TAG_RESERVED_SHIFT) or 1
+                self.tagCount = self.record['FirstAvailablePageTag'] & FIRST_AVAILABLE_PAGE_TAG_MASK


This change fixes a specific crash/regression for 32 KiB pages by masking FirstAvailablePageTag to 12 bits. Please add a regression test (unit-level if possible) that builds/parses a page header where FirstAvailablePageTag has high bits set (e.g., 0x100c) and asserts tagCount == 0x000c and tag iteration does not raise.
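
A unit-level test along the lines requested here could be sketched as below. This is not the test that was later added to the PR; `effective_tag_count` is a hypothetical stand-in for the value ESENT_PAGE derives from the header, and the mask value is inferred from the "12 bits" wording.

```python
import unittest

FIRST_AVAILABLE_PAGE_TAG_MASK = 0x0FFF  # inferred from "masking ... to 12 bits"


def effective_tag_count(raw, large_page):
    """Hypothetical stand-in for the tag count derived from the page header."""
    return raw & FIRST_AVAILABLE_PAGE_TAG_MASK if large_page else raw


class LargePageTagCountTest(unittest.TestCase):
    def test_high_bits_are_masked(self):
        self.assertEqual(effective_tag_count(0x100C, large_page=True), 0x000C)

    def test_iteration_stays_in_bounds(self):
        tags = [b'\x00' * 4] * 12  # a page that really holds 12 four-byte tags
        count = effective_tag_count(0x100C, large_page=True)
        # Would raise IndexError if the raw value (4108) were used as the count.
        for i in range(count):
            _ = tags[i]

    def test_small_pages_unchanged(self):
        self.assertEqual(effective_tag_count(0x000C, large_page=False), 0x000C)
```

Saved as a module, it can be run with `python -m unittest`.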

Copilot · 2026-03-27T08:41:13Z

+            self.tagCount = self.record['FirstAvailablePageTag']
+            if self.__DBHeader['FileFormatRevision'] >= 0x11 and self.__DBHeader['PageSize'] > 8192:
+                # TODO: The upper 4 bits may encode how many leading tags are reserved on large pages.
+                    # Logical node counts should be derived from the effective reserved-tag count
+                    # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.
+                self.tagReserved = (self.record['FirstAvailablePageTag'] >> FIRST_AVAILABLE_PAGE_TAG_RESERVED_SHIFT) or 1
+                self.tagCount = self.record['FirstAvailablePageTag'] & FIRST_AVAILABLE_PAGE_TAG_MASK


The large-page condition in ESENT_PAGE.__init__ only checks FileFormatRevision/PageSize, but other large-page parsing logic in this module (e.g., getTag()) also gates on Version == 0x620. Consider aligning the predicate here with getTag() to avoid masking FirstAvailablePageTag on database versions that don’t use the 12-bit tag-count encoding.
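
One way to keep the predicate identical everywhere is a single shared check, sketched below. `uses_large_page_tags` is a hypothetical helper, and a plain dict stands in for Impacket's database header structure; the three conditions are the ones quoted in this review.

```python
def uses_large_page_tags(db_header):
    """True when the 12-bit tag-count encoding is assumed to apply."""
    return (
        db_header['Version'] == 0x620
        and db_header['FileFormatRevision'] >= 0x11
        and db_header['PageSize'] > 8192
    )


print(uses_large_page_tags({'Version': 0x620, 'FileFormatRevision': 0x11, 'PageSize': 32768}))  # True
print(uses_large_page_tags({'Version': 0x620, 'FileFormatRevision': 0x11, 'PageSize': 8192}))   # False
```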

Copilot · 2026-03-27T08:41:13Z

+                    # Logical node counts should be derived from the effective reserved-tag count
+                    # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.


The comment block under the large-page if is over-indented (lines after the TODO have extra indentation). This makes the code harder to read; align the comment indentation with the rest of the block.

Suggested change

-                    # Logical node counts should be derived from the effective reserved-tag count
-                    # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.
+                # Logical node counts should be derived from the effective reserved-tag count
+                # instead of assuming only tag 0 is reserved, the logical node count should be tagCount - tagReserved.

gabrielg5 · 2026-04-09T15:37:58Z

check together with #2165

added testcase

anadrianmanrique · 2026-04-30T14:40:59Z

all tests passed ok. merging now

Fix issue fortra#1924 large-page tag count parsing

b1d51f6

Copilot AI review requested due to automatic review settings March 27, 2026 08:36

Copilot started reviewing on behalf of alexisbalbachan March 27, 2026 08:36

Copilot AI reviewed Mar 27, 2026

Applied code review changes, added unit test

42fe79c

anadrianmanrique self-assigned this Apr 16, 2026

anadrianmanrique added the medium (Medium priority item) and bug (Unexpected problem or unintended behavior) labels Apr 16, 2026

alexisbalbachan mentioned this pull request Apr 20, 2026

fix: support 32KB page sizes in ESE parser for Windows Server 2025 #2165

Open

anadrianmanrique and others added 2 commits April 23, 2026 11:42

improved code

3cf4a82

added testcase

Fix USER_PROPERTIES parsing per MS-SAMR spec

de1e923

anadrianmanrique merged commit 426f354 into fortra:master Apr 30, 2026
7 checks passed

anadrianmanrique linked an issue Apr 30, 2026 that may be closed by this pull request

secretsdump.py does not parse Windows Server 2025 NTDS.dit #1924

Closed

This was referenced Apr 30, 2026

secretsdump.py does not parse Windows Server 2025 NTDS.dit #1924

Closed

structure.py is not working on Windows 11, 25H2 for srum.dat #2100

Closed

This was referenced May 27, 2026

secretsdump.py does not complete, #1533

Open

secretsdump.py: ('unpack requires a buffer of 2 bytes', "When unpacking field 'NameLength | <H=0 | b''[:2]'" #1099

Open

anadrianmanrique merged 4 commits into fortra:master from alexisbalbachan:2025_ntds_parse_fix

4 participants
