
Decrement reference count of old token when template token is replaced. #581

Merged
edwardalee merged 3 commits into main from memory-leak-fix
Apr 17, 2026
Conversation

@edwardalee
Contributor

This PR addresses #580.

Copilot AI review requested due to automatic review settings April 17, 2026 16:03
Contributor

Copilot AI left a comment


Pull request overview

This PR aims to fix issue #580 by preventing a ref-count–related memory leak when a template token is replaced, ensuring the old template token’s reference is released when a new token is created.

Changes:

  • Update _lf_get_token() to release the template’s reference to the old shared token when allocating a new token.
  • Install the newly created token as tmplt->token rather than leaving the template pointing at the old shared token.
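The shape of the fix can be sketched with a hypothetical, heavily simplified token and template struct (these are illustrative stand-ins, not the actual reactor-c types or the real `_lf_get_token` logic): when a fresh token is allocated, the template's reference to the old shared token must be released before the new token is installed, otherwise the old token's count never reaches zero and it leaks.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical simplified stand-ins for the runtime's token and template. */
typedef struct {
  int ref_count;
} token_t;

typedef struct {
  token_t* token; /* the template's shared token */
} template_t;

static token_t* new_token(void) {
  token_t* t = malloc(sizeof(token_t));
  t->ref_count = 1; /* held by whoever allocated it */
  return t;
}

/* Drop one reference; free the token when none remain. */
static void done_using(token_t* t) {
  if (t != NULL && --t->ref_count == 0) free(t);
}

/* The fix, in simplified form: release the template's reference to the
 * old token before installing the freshly allocated one. Without the
 * done_using call, the old token's count can never reach zero: a leak. */
static token_t* get_token(template_t* tmplt) {
  token_t* fresh = new_token();
  done_using(tmplt->token); /* release the template's old reference */
  tmplt->token = fresh;     /* install the new token as tmplt->token */
  return fresh;
}
```

In this model, a template whose old token is also referenced elsewhere ends up with that old token at one remaining reference after `get_token`, rather than a permanently stuck count.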


Comment thread core/lf_token.c
@edwardalee
Contributor Author

This fix exposed another bug, which caused the Python test ConcurrentActionRepeat.lf to fail intermittently. Claude's analysis:

The remaining race

Even with the env lock held across py_schedule, the race was between the scheduler thread (popping events) and worker threads (running reaction prologues):

  1. _lf_pop_events calls _lf_replace_template_token then _lf_done_using(token), leaving trigger->tmplt.token at ref_count == 1, then releases env lock.
  2. Before the worker thread acquires env lock for the reaction prologue, a different Python thread's py_schedule acquires env lock and enters _lf_initialize_token_with_value → _lf_get_token.
  3. _lf_get_token sees ref_count == 1 and takes the REUSE path — freeing token->value and overwriting it with the new schedule's value.
  4. When the reaction prologue finally runs, it reads the corrupted payload → the original event sees the new value, producing a "Duplicate action".

This race was always present in principle but was masked before by an extra "phantom" ref (the original memory leak). The leak fix removed the phantom, making the race reliably reproducible.
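The reuse hazard in step 3 can be illustrated single-threaded with a hypothetical simplified model (names and logic are illustrative, not the actual runtime API): once the template token's count has fallen to 1, the next `get_token`-style call takes the reuse path and clobbers, in place, a payload that a pending reader still expects to see.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical simplified token: a ref count plus a payload pointer. */
typedef struct {
  int ref_count;
  int* value; /* payload */
} token_t;

/* Simplified reuse path, mirroring the behavior described above: if only
 * the template holds the token (ref_count == 1), reuse it, freeing and
 * overwriting the old payload in place. Any reader that still expected
 * the old payload now silently sees the new one. */
static token_t* get_token(token_t* tmplt_token, int new_payload) {
  if (tmplt_token->ref_count == 1) {
    free(tmplt_token->value);          /* old payload destroyed...       */
    tmplt_token->value = malloc(sizeof(int));
    *tmplt_token->value = new_payload; /* ...and overwritten in place    */
    return tmplt_token;                /* REUSE path */
  }
  /* (allocation path omitted; not needed to show the hazard) */
  return NULL;
}
```

Here the "pending reaction" that intended to read the original payload instead observes the newly scheduled value, which is exactly the corrupted-payload symptom described above.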

The fix

py_schedule now allocates a fresh token via lf_new_token instead of _lf_initialize_token_with_value. lf_new_token does not touch trigger->tmplt.token:

  LF_CRITICAL_SECTION_ENTER(env);

  // Check to see if value exists
  if (value) {
    // Allocate a fresh token for this schedule call rather than routing through
    // _lf_initialize_token_with_value / _lf_get_token. Those paths may reuse
    // or replace trigger->tmplt.token, which races with the reaction prologue
    // that reads trigger->tmplt.token->value after an event pop:
    // ...
    trigger->tmplt.type.element_size = sizeof(PyObject*);
    t = lf_new_token((void*)&trigger->tmplt, value, 1);
#if !defined NDEBUG
    LF_CRITICAL_SECTION_ENTER(GLOBAL_ENVIRONMENT);
    extern int _lf_count_payload_allocations;
    _lf_count_payload_allocations++;
    LF_CRITICAL_SECTION_EXIT(GLOBAL_ENVIRONMENT);
#endif
    Py_INCREF(value);
    act->value = value;
  }

  lf_schedule_trigger(env, trigger, offset, t);
  lf_notify_of_event(env);
  LF_CRITICAL_SECTION_EXIT(env);

Now:

  • Each py_schedule creates its own event-local token. The template trigger->tmplt.token is only modified by _lf_pop_events, which is serialized with reaction execution.
  • lf_schedule_trigger bumps the event-local token's ref from 0 → 1.
  • At pop time, _lf_replace_template_token installs the event-local token into the trigger (ref → 2), then _lf_done_using drops it to 1 (held by template).
  • The reaction prologue reads the template token's payload, which has not been mutated by any concurrent scheduler.
  • The payload allocation counter is incremented manually to keep the leak-detection invariant balanced.

This mirrors the same pattern that py_port_set already uses for ports (line 74 of python_port.c).
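The ref-count trajectory of the event-local token in the fixed flow can be traced with the same kind of simplified model (a sketch under assumed semantics, not the reactor-c implementation): allocated at 0, bumped to 1 when scheduled, to 2 when installed as the template token at pop time, then back to 1 once the popped event drops its reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical simplified token and template, to trace ref counts only. */
typedef struct {
  int ref_count;
} token_t;

typedef struct {
  token_t* token;
} template_t;

/* lf_new_token analogue: a fresh event-local token with no references,
 * and crucially one that does not touch the template. */
static token_t* new_token(void) {
  return calloc(1, sizeof(token_t)); /* ref_count == 0 */
}

/* lf_schedule_trigger analogue: the pending event takes a reference. */
static void schedule(token_t* t) {
  t->ref_count++; /* 0 -> 1 */
}

/* _lf_replace_template_token analogue: the template takes a reference to
 * the event-local token and releases its reference to the previous one. */
static void replace_template_token(template_t* tmplt, token_t* t) {
  t->ref_count++; /* event's ref plus template's ref: 1 -> 2 */
  if (tmplt->token != NULL && --tmplt->token->ref_count == 0) free(tmplt->token);
  tmplt->token = t;
}

/* _lf_done_using analogue: the popped event drops its reference. */
static void done_using(token_t* t) {
  if (--t->ref_count == 0) free(t);
}
```

Running the sequence schedule → replace_template_token → done_using on a fresh token leaves the template as the sole holder at ref count 1, matching the bullet points above.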

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.



Comment thread python/lib/pythontarget.c
Comment thread python/lib/pythontarget.c
Comment thread python/lib/pythontarget.c
@edwardalee edwardalee merged commit 51034b9 into main Apr 17, 2026
34 checks passed


2 participants