RUM-16442: Fix profiler safe memory reads by simaoseica-dd · Pull Request #2967 · DataDog/dd-sdk-ios

simaoseica-dd · 2026-06-02T14:49:51Z

What and why?

Fixes a crash in the iOS Mach profiler caused by reading stale or invalid frame pointers during stack unwinding.

The previous safe-read path, introduced in #2605, relied on faulting memory access recovery but invalid stack memory could still terminate the app with EXC_BAD_ACCESS / SIGSEGV.
This PR makes stack memory reads non-faulting so the profiler degrades safely instead of crashing.

How?

Replaces direct memory reads during stack sampling with vm_read_overwrite.
The profiler now first tries to read the expected stack region in one batch for performance. If that read fails, it falls back to page-by-page reads from the stack pointer and keeps the readable prefix, allowing partial stack recovery without touching invalid memory.

Review checklist

Feature or bugfix MUST have appropriate tests (unit, integration)
Make sure each commit and the PR mention the Issue number or JIRA reference
Add CHANGELOG entry for user facing changes
Add Objective-C interface for public APIs - see our guidelines (internal)
Run make api-surface when adding new APIs

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 678a9f1e2a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-03T14:35:06Z

+        const size_t region_size = std::min(
+            current_frame_size + (static_cast<size_t>(max_depth) * STACK_REGION_FRAME_SIZE_ESTIMATE),
+            STACK_REGION_MAX_READ


Avoid truncating stacks at the heuristic read window

When a sampled thread has any caller frame more than about max_depth * 128 bytes above the current frame pointer (or a current frame larger than the 64 KB cap), the saved FP/PC pair remains mapped but lies outside bytes_read, so walk_frames_in_buffer stops after the leaf frames. The previous per-frame read followed valid frame pointers up to max_depth; this heuristic window silently drops the rest of otherwise valid stacks for code with large stack allocations, biasing collected profiles rather than just degrading on invalid memory.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-03T14:35:06Z

+            STACK_REGION_MAX_READ
+        );
+
+        bytes_read = read_stack_region(sp, stack_buf, region_size);


Do not call Mach VM reads while the target thread is suspended

In the default all-thread mode this call runs from sample_thread after thread_suspend, despite that critical section explicitly warning to avoid system calls that may acquire locks. If the sampled thread is suspended while it is in a VM operation (for example mmap/deallocation or another path holding the task VM map lock), vm_read_overwrite has to enter the same Mach VM machinery and can block forever, leaving the target thread suspended and hanging the app; the old direct read avoided introducing a Mach syscall in this suspended region.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5320784f8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-03T16:32:53Z

    }

+    if (has_thread_state) {
+        stack_trace_sample_context(&trace, fp, pc, sp, config.max_stack_depth);


Keep the stack snapshot synchronized with the captured registers

When the sampled thread resumes before stack_trace_sample_context copies the stack, any thread that returns from the captured frame or pushes new frames in that gap is no longer guaranteed to have stack memory matching the suspended fp/pc/sp values. The walker then combines a PC from the suspended instant with caller frames from later execution (or stops early after the stack is reused), which can corrupt profiler samples for actively running threads rather than representing a consistent stack.

Useful? React with 👍 / 👎.

fuzzybinary

This LGTM other than some minor nits about style, but I'm trusting your tests exercise this properly as I'm a little out of my depth on the specifics of stack walking for profiling.

fuzzybinary · 2026-06-03T17:38:03Z

+    const uintptr_t address = reinterpret_cast<uintptr_t>(addr);
+    if (address > UINTPTR_MAX - size) {
+        return 0;
+    }


Is this supposed to be a check on addr or on buf? Or should it check both? I would think since addr is the read address, if were going to pass UINTPTR_MAX than it would just return the amount read when calling vm_read_overwirte.

fuzzybinary · 2026-06-03T17:42:58Z

+        }
+
+        total_read += bytes_read;
+        if (total_read == size || current_address > UINTPTR_MAX - bytes_read) {


I'm confused about this check:

current_address > UINTPTR_MAX - bytes_read

If this is true, doesn't that mean read_memory_region has already read past UINTPTR_MAX?

fuzzybinary · 2026-06-03T17:49:35Z

+    size_t bytes_read,
+    void* next_frame[2]
+) {
+    if (stack_base == 0 || stack_buf == nullptr || bytes_read < sizeof(void*[2]) || fp < stack_base) {


if you can uses constexpr it might be worth factoring out the repeated sizeof checks:

constexpr size_t read_size = sizeof(void*[2]);

Actually, this might be a larger refactor, but I'm seeing a lot of use of void*[2] -- refactoring this into a struct would make this easier to read with no impact on overhead:

struct Frame { void* next_frame_pointer; void* return _addresss; }

Maybe worth a the change sometime in the future.

This comment has been minimized.

Sign in to view

simaoseica-dd force-pushed the simaoseica/RUM-16442/use-vm_read_overwrite branch 2 times, most recently from c49f01f to 678a9f1 Compare June 3, 2026 10:33

simaoseica-dd marked this pull request as ready for review June 3, 2026 14:28

simaoseica-dd requested review from a team as code owners June 3, 2026 14:28

simaoseica-dd requested review from awforsythe and fuzzybinary June 3, 2026 14:29

chatgpt-codex-connector Bot reviewed Jun 3, 2026

View reviewed changes

RUM-16442: Fix profiler safe memory reads

b532078

simaoseica-dd force-pushed the simaoseica/RUM-16442/use-vm_read_overwrite branch from 678a9f1 to b532078 Compare June 3, 2026 16:28

chatgpt-codex-connector Bot reviewed Jun 3, 2026

View reviewed changes

fuzzybinary approved these changes Jun 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RUM-16442: Fix profiler safe memory reads#2967

RUM-16442: Fix profiler safe memory reads#2967
simaoseica-dd wants to merge 1 commit into
feature/profilingfrom
simaoseica/RUM-16442/use-vm_read_overwrite

simaoseica-dd commented Jun 2, 2026 •

edited

Loading

Uh oh!

This comment has been minimized.

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Uh oh!

fuzzybinary left a comment

Uh oh!

fuzzybinary Jun 3, 2026

Uh oh!

fuzzybinary Jun 3, 2026

Uh oh!

fuzzybinary Jun 3, 2026

Uh oh!

fuzzybinary Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

simaoseica-dd commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What and why?

How?

Review checklist

Uh oh!

This comment has been minimized.

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

fuzzybinary left a comment

Choose a reason for hiding this comment

Uh oh!

fuzzybinary Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

fuzzybinary Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

fuzzybinary Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

fuzzybinary Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

simaoseica-dd commented Jun 2, 2026 •

edited

Loading