Skip to content

WIP: Feat/fairness scheduler#570

Closed
chall37 wants to merge 77 commits into
gnachman:masterfrom
chall37:feat/fairness-scheduler-rebase
Closed

WIP: Feat/fairness scheduler#570
chall37 wants to merge 77 commits into
gnachman:masterfrom
chall37:feat/fairness-scheduler-rebase

Conversation

@chall37
Copy link
Copy Markdown
Contributor

@chall37 chall37 commented Jan 28, 2026

This is a non-feature gated implementation of the entire pipeline. I still need to shake down some issues, but I think the architecture is in place. I'm still working on the feature-gated parallel implementation as described in #567 and #568 but it's a heavy lift. In the meantime, this can serve as a point of direct comparison between legacy and proposed architectural changes.

Fairness Scheduler PR - Changes Summary

This document summarizes changes in this PR compared to gnachman:master.

Problem Statement

iTerm2 has two starvation points that cause one high-throughput session (e.g., running yes) to block other sessions:

  1. Read-level starvation (TaskNotifier thread): A single-threaded select() loop monitors all PTY file descriptors. When TokenExecutor.addTokens() blocks on semaphore.wait(), the entire TaskNotifier thread is blocked, preventing ALL other sessions from being read.

  2. Execution-level starvation (mutation queue): All sessions share a single mutation queue. Whichever session's execute() runs first processes all its tokens with no mechanism ensuring sessions take turns.

Solution Overview

Replace blocking I/O with event-driven dispatch sources and add a round-robin fairness scheduler:

  1. Per-PTY dispatch sources - Decouples reading so one session can't block others
  2. Non-blocking backpressure - Replace semaphore blocking with suspend/resume
  3. FairnessScheduler - Coordinates round-robin execution on the mutation queue
  4. Token budget per turn - Execute groups until ~500 tokens consumed, then yield

Discussion (continued from PR #560):

"Fairness means that each PTY has some of its tokens executed after waiting for other PTYs to have no more than one turn getting their tokens executed."

"A round-robin scheduler 'busy list' that tracks a list of PTYs that have tokens to execute provides exactly that."

"This plays naturally into using dispatch_source for reading rather than select()."

  • Busy list

Implemented as described. Sessions are added to the tail when they have work, removed from the head when granted a turn, and moved to the tail if work remains after the turn.

  • Token budget

The gist pseudocode used a fixed chunk size. This implementation uses a ~500-token budget checked at group boundaries, ensuring atomic group execution while providing approximate fairness. (This is a lever we can adjust, if necessary.)

  • Read suspension

Currently, TokenExecutor.addTokens blocks on a semaphore to backpressure PTY reads. In this implementation, the blocking wait is replaced with per-PTY reads gated by dispatch sources in PTYTask (the read source starts suspended; updateReadSourceState resumes or suspends it).

  • Per-PTY vs global backpressure: A global capacity counter gating admission across all producers was suggested in Add backpressure counter for monitoring token queue load #560, plus per-producer thresholds for self-throttling. This implementation uses per-PTY only—each TokenExecutor tracks its own availableSlots with no global counter. This should be sufficient for isolation—when Session A is full, only A suspends while B and C continue reading, and it maps directly to per-PTY dispatch sources without requiring scheduler coordination. With the current per-session limit of 64 TokenArrays, this seems acceptable, though global backpressure could be added later if memory pressure becomes a concern.

    • Implementation cost

      • New plumbing: BackpressureLevel / availableSlots, consumption callbacks in TokenExecutor, and backpressureReleaseHandler wiring in PTYSession.
      • Added dispatch-source read/write state management in PTYTask to suspend and resume I/O based on backpressureLevel, paused, and ioAllowed.
    • Alternatives considered

      • Global capacity counter: Add a shared counter across all sessions. Suspend all reads when total queued tokens exceed a system-wide limit. Provides tighter memory bounds under heavy multi-session load at the cost of reduced isolation and greater overhead.
      • Acceptance-based resumption: Resume reading when the scheduler grants a turn rather than after consumption. Simpler signaling (scheduler → PTYTask directly), but potentially more complicated coordination, such as preemptive slot reservation to avoid enqueueing into a full queue at turn-start.
      • Scheduler-driven suspension: More centralized control, but adds coupling between scheduler and I/O, blurring SoC.
      • No backpressure: Remove read-side throttling entirely; let queues grow unbounded. Not viable without other memory limits.
  • Coprocess handling: TaskNotifier is retained in a reduced role for coprocess FD handling and process lifecycle management. Full elimination of TaskNotifier is deferred to a future phase to limit scope and risk.

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         READING (Decoupled)                         │
│                                                                     │
│   PTYTask A              PTYTask B              PTYTask C           │
│   dispatch_source        dispatch_source        dispatch_source     │
│        │                      │                      │              │
│        ▼                      ▼                      ▼              │
│   TokenExecutor A        TokenExecutor B        TokenExecutor C     │
│   (suspend/resume)       (suspend/resume)       (suspend/resume)    │
└────────┼──────────────────────┼──────────────────────┼──────────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                ▼
┌─────────────────────────────────────────────────────────────────────┐
│                 FAIRNESS SCHEDULER (Gatekeeper)                     │
│                                                                     │
│   Busy List: [A] → [B] → [C] → ...  (round-robin)                   │
│   Grants permission: "Execute groups until ~500 tokens"             │
└─────────────────────────────────────────────────────────────────────┘

Key Changes by Component

1. FairnessScheduler (NEW)

File: sources/FairnessScheduler.swift

Round-robin scheduler that controls which session executes on the mutation queue. Does NOT buffer tokens - only controls execution order. Each session gets a turn with a ~500 token budget before yielding.

Justification: Provides deterministic fairness guarantees. No session can monopolize execution regardless of output rate.

2. PTYTask Dispatch Sources

Files: sources/PTYTask.h, sources/PTYTask.m

Adds per-PTY dispatch_source_t for read and write operations:

  • _readSource - DISPATCH_SOURCE_TYPE_READ
  • _writeSource - DISPATCH_SOURCE_TYPE_WRITE
  • Unified state predicates (shouldRead/shouldWrite)
  • State update methods (updateReadSourceState/updateWriteSourceState)

Justification: Decouples PTY I/O from the shared TaskNotifier thread. One session's backpressure cannot block another session's reads.

3. TokenExecutor Modifications

File: sources/TokenExecutor.swift

  • Removes semaphore blocking - addTokens() never blocks
  • Adds executeTurn() - Fairness-limited execution with token budget
  • Adds backpressureReleaseHandler - Closure to signal PTYTask when backpressure clears
  • Removes activeSessionsWithTokens - Legacy foreground preemption no longer needed

Justification: Non-blocking token admission is required for dispatch sources (handlers must return quickly). The fairness scheduler replaces the ad-hoc foreground preemption.

4. TaskNotifier Changes

Files: sources/TaskNotifier.h, sources/TaskNotifier.m

  • PTY tasks with dispatch sources skip select() for their FDs
  • Still handles: coprocess FDs, deadpool/waitpid, unblock pipe
  • New useDispatchSource protocol method (@optional)

Justification: Hybrid approach allows incremental migration. Coprocesses remain on select() for now (rare feature, low risk).

5. VT100ScreenMutableState Integration

File: sources/VT100ScreenMutableState.m

  • Registers TokenExecutor with FairnessScheduler on init
  • Unregisters on setEnabled:NO
  • Stores _fairnessSessionId for stable identification

Justification: Lifecycle management ensures sessions are properly tracked by the scheduler.

6. TwoTierTokenQueue

File: sources/TwoTierTokenQueue.swift

  • Adds discardAllAndReturnCount() for cleanup accounting

Justification: Prevents availableSlots drift when sessions close with unconsumed tokens.

7. PTYSession Wiring

File: sources/PTYSession.m

  • Wires backpressureReleaseHandler to updateReadSourceState
  • Adds re-kick calls via scheduleTokenExecution for state transitions

Justification: Connects TokenExecutor backpressure signals to PTYTask dispatch source control.

Location: ModernTests/FairnessScheduler/

Test Methods organized by component:

Test File Coverage Area
FairnessSchedulerTests.swift Round-robin scheduling, busy list management, session lifecycle, edge cases (empty list, single session, rapid add/remove)
TokenExecutorFairnessTests.swift Token budget enforcement, executeTurn() behavior, backpressure handler wiring, accounting invariants, high-priority token handling
PTYTaskDispatchSourcePredicateTests.swift shouldRead/shouldWrite predicate logic under all condition combinations (paused, ioAllowed, backpressure, writeBuffer)
PTYTaskDispatchSourceHandlerTests.swift Read/write event handler behavior, coprocess data bridging, error handling
PTYTaskDispatchSourceBasicTests.swift Dispatch source lifecycle (setup, teardown, resume-before-cancel)
PTYTaskDispatchSourceIntegrationTests.swift End-to-end PTYTask I/O flow with real file descriptors
TaskNotifierDispatchSourceTests.swift Hybrid mode (dispatch source tasks skip select), coprocess FD handling preserved
TwoTierTokenQueueTests.swift discardAllAndReturnCount(), queue accounting
IntegrationTests.swift Full system integration: multiple sessions, fairness under load, session close with pending tokens
PTYSessionWiringTests.swift Backpressure handler wiring, re-kick on state transitions

Mock/utility infrastructure: MockTokenExecutorDelegate, MockPTYTaskDelegate, EnqueuingPTYTaskDelegate, SpyVT100Screen, and test utilities.

Key invariants:

  • availableSlots accounting (no drift after enqueue/consume cycles)
  • Round-robin ordering (each session permitted exactly one turn before others)
  • Dispatch source state consistency (suspend count never goes negative)
  • Graceful handling of session close with unconsumed tokens

Files Unchanged

Component Why Unchanged
TwoTierTokenQueue (mostly) Per-session high-priority handling already correct
High-priority token paths Triggers, API injection work correctly within session turns
Side effects system Unrelated to fairness (30fps cadence on main queue)
Mutation queue Still single-threaded; fairness controls who uses it

gnachman and others added 30 commits January 27, 2026 18:56
Add FAST_INT_ACCESSOR pattern to cache integer preferences
similar to existing FAST_BOOL_ACCESSOR. Apply to sideMargins
and topBottomMargins which are called 55 times per render frame
in the drawing hot path.

- Add iTermPreferencesIntCache struct
- Add FAST_INT_ACCESSOR macro
- Add intWithCache: method
- Add sideMargins and topBottomMargins fast accessors
- Update 29 call sites in iTermTextDrawingHelper.m
- Update 26 call sites in PTYTextView.m
tokenExecutorSync was causing severe performance issues with multiple tabs.
Each tab's TokenExecutor would trigger a SynchroDance (main thread blocking
on mutation thread), starving the visible tab's rendering.

The sync is redundant because the cadence-driven refresh path already handles
syncing: visible tabs at ~60Hz, background tabs at 1Hz via updateDisplayBecause:.

Result: Eliminated ~85% of SynchroDance calls, reducing main thread blocking
by ~37% and restoring smooth FPS in visible tabs with 10+ tabs running output.
- Add timeout to whilePaused() to prevent deadlock if mutation queue is stuck
- Crash with it_fatalError on timeout since skipping joined block would leave
  state inconsistent (DEBUG: 2s max, RELEASE: 15s max)
- Add cancellation flag to prevent async block from setting joined state after
  timeout
- Add DEBUG-only reentrancy guard to detect overlapping joins
- Update performSynchroDanceWithBlock: to return BOOL for API clarity
- Add BackpressureLevel enum (none, light, moderate, heavy, blocked)
- Track available slots with atomic counter alongside DispatchSemaphore
- Add onSemaphoreSignaled callback to TokenArray for slot release notification
- Expose backpressureLevel property on TokenExecutor (callable from any queue)

This enables adaptive behavior based on queue load (e.g., join timeouts).
…pass

- Initialize availableSlots with correct value in property declaration using
  immediately-executed closure, avoiding race where backpressureLevel could
  return .blocked before init completes
- Document that high-priority tokens intentionally bypass backpressure, so
  metric only reflects normal PTY token load
Flag potential bugs where queued tokens sit in purgatory until new data
arrives because scheduleTokenExecution is not called when these modes
are disabled (unlike copyModeHandlerDidChangeEnabledState which does).

Also note potential conversion to mutateAsynchronously for single-writer
consistency once the correctness questions are resolved.
Previously, every cadence change would cancel the dispatch source and
create a new one. Now the timer is created once and dispatch_source_set_timer
is used to update timing parameters, eliminating kernel call overhead.
Major refactoring to reduce CPU usage from process monitoring with many tabs:

- Add global coalescer using DISPATCH_SOURCE_TYPE_DATA_OR to merge concurrent
  monitor events into batched refresh cycles
- Implement collection-first refresh that walks existing process collection
  instead of enumerating all PIDs via syscalls
- Suspend monitors for background tabs; only foreground tabs get fast updates
- Add epoch-based tracking to detect stale cache entries
- Preserve dirty PID signaling for session title/status updates
- Only remove cache entries when kill(pid,0) confirms process is dead,
  avoiding transient nil results from collection staleness

Supporting changes:
- iTermProcessMonitor: Add trackedRootPID property and pause/resume methods
- iTermLSOF: Add childPidsForPid: using proc_listchildpids with retry loop
- PseudoTerminal: Call setForegroundRootPIDs: on tab selection
- Throttle backgroundRefreshTick to 0.5s minimum intervals instead of
  running on every updateIfNeeded call
- Batch dispatch_sync calls in refreshRootCollectionFirst: collect PIDs
  in NSMutableIndexSet, single sync at end instead of per-PID
- Same batching fix for refreshRootWithKernelFallback
- Remove unused markSeenForEpoch helper method
- Fix property name: isForeground → isForegroundJob
Implementation plan covers FairnessScheduler, TokenExecutor modifications,
PTYTask dispatch sources, and TaskNotifier changes. Testing plan defines
unit, integration, and smoke tests with checkpoints for incremental
validation.
Documents functional requirements for triggers, API injection, and other
high-priority token categories to preserve during fairness scheduler work.
Test infrastructure for round-robin fair scheduling implementation:

- FairnessSchedulerTests.swift: 18 unit tests covering registration,
  busy list management, turn execution, and round-robin fairness
- MockFairnessSchedulerExecutor: Mock executor for isolated testing
- TokenExecutorFairnessTests.swift: Phase 2 test stubs (8 placeholders)
- FairnessScheduler.swift: Stub implementation (tests fail as expected)
- run_fairness_tests.sh: Isolated test runner with phase filtering
- add_to_xcode_project.py: Helper to add Swift files to Xcode project

Also fixes duplicate FAST_INT_ACCESSOR declarations in iTermPreferences.m

All 18 Phase 1 tests fail against stub, validating test infrastructure.
Run with: ./tools/run_fairness_tests.sh phase1
Replace stub with working implementation that coordinates token
execution across PTY sessions. Key features:
- Session registration with monotonically increasing IDs
- Round-robin busy list with O(1) duplicate prevention via busySet
- Turn execution flow handling completed/yielded/blocked results
- Cleanup callback on unregistration

All 18 Phase 1 tests pass. Uses main queue for now; will switch to
mutation queue during integration phase.
Tests (32 total, 7 passing, 25 skipped pending implementation):
- TokenExecutorNonBlockingTests: Non-blocking addTokens()
- TokenExecutorAccountingTests: Token consumption accounting
- TokenExecutorExecuteTurnTests: executeTurn() method
- TokenExecutorBudgetEdgeCaseTests: Budget enforcement
- TokenExecutorSchedulerEntryPointTests: Scheduler notifications
- TokenExecutorLegacyRemovalTests: Legacy code removal
- TokenExecutorCleanupTests: Cleanup on unregistration
- TokenExecutorAccountingInvariantTests: Accounting invariants

Tests include both positive and negative cases to verify absence
of undesired behavior. Skipped tests use XCTSkip to document
requirements without hanging or failing.

Documentation updates:
- implementation.md: Added status table showing Phase 1 complete
- testing.md: Added test status table with run commands
- Make TokenExecutor conform to FairnessSchedulerExecutor protocol
- Add executeTurn(tokenBudget:completion:) with budget enforcement
- Add cleanupForUnregistration() for accounting cleanup
- Add backpressureReleaseHandler and fairnessSessionId properties
- Remove semaphore blocking from addTokens() - now non-blocking
- Add notifyScheduler() to integrate with FairnessScheduler
- Remove activeSessionsWithTokens legacy preemption code
- Make BackpressureLevel conform to Comparable
- Add discardAllAndReturnCount() to TwoTierTokenQueue
- Rename "Phase" to "Milestone" in documentation for clarity
Implement per-PTY dispatch sources for decoupled I/O handling:

PTYTask.m:
- Add dispatch source ivars (_readSource, _writeSource, _ioQueue)
- setupDispatchSources: Creates read/write sources, starts suspended
- teardownDispatchSources: Safe teardown (resumes before cancel)
- shouldRead/shouldWrite: Unified state predicates
- updateReadSourceState/updateWriteSourceState: Idempotent updates
- handleReadEvent/handleWriteEvent: Event handlers
- writeBufferDidChange: Triggers write state update
- Update setPaused to call state update methods
- Update dealloc to call teardown

PTYTask.h:
- Add tokenExecutor property for backpressure monitoring

Tests:
- Add PTYTaskDispatchSourceTests.swift with 35 tests (all skipped)
- Update run_fairness_tests.sh to support milestone3 filter

Note: Infrastructure is implemented but not yet activated. Integration
(calling setupDispatchSources, wiring backpressureReleaseHandler) will
be done in Milestone 5.
Add 12 TDD-style tests for TaskNotifier dispatch source integration:

- TaskNotifierDispatchSourceProtocolTests (4 tests):
  Verify useDispatchSource @optional method in iTermTask protocol

- TaskNotifierSelectLoopTests (5 tests):
  Verify select() loop skips FD_SET for dispatch source tasks
  while keeping coprocess FDs and unblock pipe

- TaskNotifierMixedModeTests (3 tests):
  Verify mixed operation of dispatch_source + select() tasks

All tests currently skip pending implementation of:
- useDispatchSource @optional method in iTermTask protocol
- PTYTask.useDispatchSource returning YES
- TaskNotifier respondsToSelector: check in select() loop

Update run_fairness_tests.sh with milestone4 filter support.
Modify TaskNotifier to skip FD_SET for tasks using dispatch_source:

- Add useDispatchSource as @optional method in iTermTask protocol
- PTYTask implements useDispatchSource returning YES
- TaskNotifier checks respondsToSelector: before calling useDispatchSource
- FD setup loop skips FD_SET for dispatch source tasks
- Result handling loop skips handleRead/Write/Error for dispatch source tasks
- Coprocess FD handling continues for all tasks (unaffected)
- Backward compatible: tasks not implementing method use select()

This decouples PTY I/O from the TaskNotifier select() loop, allowing
per-PTY dispatch sources to handle read/write independently.
Add 24 TDD-style tests for system integration:

- IntegrationRegistrationTests (3): FairnessScheduler registration in init
- IntegrationUnregistrationTests (3): Cleanup on setEnabled:NO
- IntegrationRekickTests (4): Re-kick scheduler on unblock events
- IntegrationMutationQueueTests (2): Proper mutation queue usage
- IntegrationDispatchSourceActivationTests (3): Dispatch source setup
- IntegrationPTYSessionWiringTests (2): Full session wiring
- DispatchSourceLifecycleIntegrationTests (3): Process lifecycle
- BackpressureIntegrationTests (3): Backpressure system
- SessionLifecycleIntegrationTests (4): Session close edge cases

All tests skip pending implementation of:
- VT100ScreenMutableState registration with FairnessScheduler
- PTYSession wiring of backpressureReleaseHandler
- PTYTask.setupDispatchSources activation in launch code

Update run_fairness_tests.sh with milestone5 filter support.
Wire all components together to activate the fairness scheduler:

VT100ScreenMutableState:
- Register TokenExecutor with FairnessScheduler in init
- Store fairnessSessionId on executor and mutable state
- Unregister in setTerminalEnabled:NO before clearing delegate

PTYSession:
- Wire backpressureReleaseHandler in taskDidRegister:
- Set task.tokenExecutor for backpressure monitoring
- Handler calls task.updateReadSourceState when backpressure releases

PTYTask:
- Call setupDispatchSources in didRegister (covers launch, attach, restore)
- Dispatch sources activated when task registers with TaskNotifier

Update Milestone 4 tests to verify implementation instead of skipping.

The fairness scheduler is now fully integrated:
- Round-robin scheduling ensures fair token execution across sessions
- Per-PTY dispatch sources handle I/O independently
- Backpressure flows from TokenExecutor to PTYTask read source state
- TaskNotifier skips dispatch_source tasks in select() loop
Add thorough test coverage for the fairness scheduler:

FairnessScheduler tests:
- Thread safety tests (concurrent register/unregister/enqueue)
- Lifecycle edge cases (unregister during execution, double unregister)
- Note: Thread safety tests expose real synchronization bugs

TokenExecutor tests:
- Completion callback exactly-once semantics
- Budget enforcement boundary conditions
- Available slots boundary tests (no negative/overflow)
- High-priority task ordering verification

PTYTask tests:
- Actual behavior tests for shouldRead/shouldWrite predicates
- Pause state affects read/write correctly
- State transition tests (rapid pause/unpause cycles)
- Edge cases (nil delegate, concurrent pause changes)

Also updates run_fairness_tests.sh to include new test classes.
Synchronize all FairnessScheduler state access through the mutation queue
to prevent data races when methods are called from multiple threads:

- register(): sync dispatch (returns session ID)
- unregister(): async dispatch
- sessionDidEnqueueWork(): async dispatch
- ensureExecutionScheduled(): dispatch to mutationQueue instead of main
- executeTurn completion: dispatch back to mutationQueue

Update tests to wait for async unregister operations to complete before
asserting on cleanup state.
Replace placeholder "verified by implementation" tests with real behavioral
assertions using MockTaskNotifierTask, an Objective-C mock that implements
the iTermTask protocol.

Changes:
- Add MockTaskNotifierTask.h/.m in sources/ for iTermTask protocol conformance
- Add MockTaskNotifierTask+Swift.swift for Swift convenience API
- Add mock to iTerm2SharedARC-Bridging-Header.h for Swift visibility
- Add ITERM_DEBUG to all 4 ModernTests build configurations
- Update TaskNotifierDispatchSourceTests.swift with real assertions:
  - testDispatchSourceTaskSkipsFdSet: verifies processReadCallCount == 0
  - testLegacyTaskProcessReadCalledBySelect: verifies processReadCallCount > 0
  - testMixedDispatchSourceAndSelectTasks: verifies both paths work together

All 13 TaskNotifier dispatch source tests now pass with real behavior verification.
Replace method-existence tests with actual behavioral verification:
- testSetupCreatesSourcesWhenFdValid: Creates pipe, sets fd, verifies
  sources exist and pause suspends read source
- testTeardownCleansUpSources: Verifies sources are nil after teardown
- testSetPausedTogglesSourceSuspendState: Verifies pause/unpause cycle
  properly suspends/resumes dispatch sources
- testWriteBufferDidChangeWakesWriteSource: Tests write buffer mechanism
- testBackpressureHeavySuspendsReadSource: Verifies heavy backpressure
  suspends read source

Add test hooks to PTYTask (in ITERM_DEBUG block):
- testSetFd: Set fd for testing (bypasses readonly property)
- testWriteBufferHasData: Check if write buffer has data
- testSetupDispatchSourcesForTesting: Manually trigger setup
- testTeardownDispatchSourcesForTesting: Manually trigger teardown
- testAppendDataToWriteBuffer: Add data to write buffer

Tests now use real pipes via createTestPipe() from TestUtilities.swift
and verify actual dispatch source state changes via debug properties.
Replace direct FairnessScheduler.register/unregister calls with tests
that exercise the actual integration points in VT100ScreenMutableState:

Registration tests now:
- Create real VT100ScreenMutableState instances
- Verify registration happens in VT100ScreenMutableState.init (line 129)
- Verify fairnessSessionId is set on tokenExecutor after init

Unregistration tests now:
- Call actual setTerminalEnabled:NO (line 219)
- Verify unregister is called before delegate is cleared (lines 217-223)
- Verify cleanupForUnregistration is invoked via the real path

Add MockSideEffectPerformer to enable creating VT100ScreenMutableState
instances for testing without full PTYSession setup.

Note: setTerminalEnabled: has early return if value unchanged, so tests
call terminalEnabled=true before terminalEnabled=false.
Add two tests to FairnessSchedulerTurnExecutionTests that verify the
scheduler will not schedule overlapping turns for the same executor when
completion is delayed. This is a key safety property for the mutation
queue model.

testNoOverlappingTurnsWhenCompletionDelayed:
- Delays completion callback while enqueueing more work
- Verifies no second executeTurn call until first completion fires
- Confirms isExecuting flag prevents concurrent execution

testWorkArrivedWhileExecutingIsPreserved:
- Verifies workArrivedWhileExecuting flag causes re-scheduling
- Even when turn completes with .completed, new work triggers next turn

These tests prove the scheduler serialization guarantee: a session
will never have executeTurn called while a previous turn completion
callback is still pending.
Update TokenExecutorBudgetEdgeCaseTests and TokenExecutorBudgetEnforcementDetailedTests
to properly validate group-boundary budget semantics using high-priority vs
normal-priority tokens to create guaranteed separate groups.

Key semantics now tested:
- Budget is checked BETWEEN groups, not within a group
- First group always executes (progress guarantee), even if it exceeds budget
- Second group does NOT execute if budget was exceeded by first group
- Groups are atomic - never split mid-execution

Strategy: High-priority tokens go to queues[0], normal-priority to queues[1].
enumerateTokenArrayGroups processes queues in order, guaranteeing two separate
groups that cannot be coalesced.

testSecondGroupSkippedWhenBudgetExceeded now verifies:
- willExecuteCount increments by exactly 1 after first turn (only first group)
- Result is .yielded (second group pending)
- Second turn processes remaining group
- Total of 2 executions (one per group)
Replace timing-based tests that used DispatchQueue.main.asyncAfter with
small timeouts, which are inherently flaky.

FairnessSchedulerTests.swift:
- testCompletedResultDoesNotReaddWithoutNewWork: Use waitForMutationQueue()
- testBlockedResultDoesNotReaddToBusyList: Use waitForMutationQueue()
- testNoOverlappingTurnsWhenCompletionDelayed: Use waitForMutationQueue()
- testUnregisterDuringExecuteTurn: Use waitForMutationQueue()
- testUnregisterAfterYieldedBeforeNextTurn: Use waitForMutationQueue()

IntegrationTests.swift:
- testTaskUnpausedSchedulesExecution: Use onWillExecute callback expectation
- testShortcutNavigationCompleteSchedulesExecution: Same pattern
- testTerminalEnabledSchedulesExecution: Same pattern
- testCopyModeExitSchedulesExecution: Same pattern

TokenExecutorFairnessTests.swift:
- Add onWillExecute callback to MockTokenExecutorDelegate for test hooks

Tests are now deterministic, faster, and more reliable by using proper
synchronization (queue flush or expectation callbacks) instead of
arbitrary timing delays.
- Add testShouldWriteOverride property to bypass jobManager.isReadOnly
  constraint for testing write source resume behavior
- Add testWaitForIOQueue method for deterministic test synchronization,
  replacing flaky Thread.sleep calls with proper queue waiting
- Add PTYTaskReadHandlerPipelineTests with 4 tests verifying:
  - Read source triggers threadedReadTask delegate callback
  - Read handler completes quickly (non-blocking verification)
  - Multiple reads accumulate correctly
  - Full pipeline to TokenExecutor enqueue works
- Fix write source tests to properly verify suspend→resume cycle
- Update MockTokenExecutorDelegate stub to document real implementation
- Register PTYTaskReadHandlerPipelineTests in test runner script
Tests gated behind #if ITERM_DEBUG with meaningless #else fallbacks
(like responds(to:) checks) become no-ops in non-debug configs.

Changes:
- Added isDebugBuild constant to TestUtilities.swift for runtime detection
- Updated testSetupCreatesSourcesWhenFdValid to use XCTSkipUnless
- Updated testTeardownCleansUpSources to use XCTSkipUnless
- Updated testRegisterOnInit to use XCTSkipUnless
- Updated testRespondsToSelectorCheckUsed to use XCTSkipUnless

This makes test requirements explicit: tests that need debug hooks are
skipped rather than silently becoming no-ops. The same pattern should
be applied to other tests with similar fallback blocks.
…sionsWithTokens

The key regression from removing activeSessionsWithTokens is that background
sessions could be starved when foreground sessions are busy. Added tests to
verify the fairness model ensures background sessions still get turns.

Unit tests (TokenExecutorFairnessTests.swift):
- testBackgroundGetsturnsWhileForegroundContinuouslyBusy: Simulates a
  foreground session receiving continuous data while background has pending
  tokens. Verifies background still gets execution turns.

Integration tests (IntegrationTests.swift):
- IntegrationBackgroundForegroundFairnessTests: New test class with:
  - testBackgroundSessionExecutesWhileForegroundBusy: Full stack test with
    VT100ScreenMutableState proving fairness works end-to-end
  - testMultipleBackgroundSessionsAllGetTurns: Verifies fairness across
    multiple background sessions

These tests verify the key guarantee: under the fairness model, ALL sessions
get round-robin turns regardless of foreground/background status.
The previous tests only proved that sessions eventually executed, not that
the round-robin fairness invariant held (each session gets at most one turn
before any session gets a second turn).

Changes:
- Add _testExecutionHistory to FairnessScheduler to record session execution
  order
- Add testGetExecutionHistory(), testGetAndClearExecutionHistory(), and
  testClearExecutionHistory() test hooks
- Rewrite fairness tests to be deterministic (no polling/timeouts):
  - Block execution using shouldQueueTokens/taskPaused
  - Add tokens to all sessions while blocked
  - Unblock and sync mutation queue
  - Verify execution history shows proper round-robin order
- Tests now verify the actual invariant: no session executes consecutively
  when other sessions have work
Tests should not use timeouts as pass/fail criteria for behavior invariants.
Timeouts are only appropriate for detecting hangs (e.g., I/O tests) or
testing actual timing requirements.

Changes:
- Replace waitForCondition(timeout) with multiple waitForMutationQueue() or
  waitForMainQueue() calls to deterministically drain queues
- Use blocking pattern (shouldQueueTokens=true) where needed to control
  execution order
- Convert assertions from "did it happen within timeout?" to "did it happen?"

The PTYTask I/O tests retain timeouts since they test actual kernel dispatch
source behavior where timeouts guard against hangs.
The old test only verified that FairnessScheduler passes a positive budget
(which it always does - defaultTokenBudget=500). It did not test the progress
guarantee or zero-budget behavior.

Changes:
- Rename old test to testSchedulerProvidesPositiveBudget (what it actually tests)
- New testZeroBudgetBehavior verifies the progress guarantee:
  - First group always executes even if it exceeds budget
  - Session yields and gets another turn to complete remaining work
  - This ensures forward progress even with large token groups
…ehavior

The old test manually invoked backpressureReleaseHandler instead of driving
the actual heavy→non-heavy transition via token consumption.

New test:
1. Drives executor to heavy backpressure (>75% slots consumed)
2. Sets up handler to track calls
3. Unblocks execution and lets tokens be consumed
4. Verifies handler was called when crossing out of heavy backpressure
5. Verifies backpressure level is reduced after consumption

This tests the real callback wiring and threshold transition logic.
Tests that spawn background threads accessing self.executor or self.scheduler
could race with tearDown deallocation, causing ASan crashes.

Fix: Capture the object in a local variable before spawning threads,
ensuring the closure holds a strong reference that survives tearDown.

Fixed tests:
- TokenExecutorNonBlockingTests.testAddTokensDoesNotBlock
- TokenExecutorNonBlockingTests.testSemaphoreNotCreated
- FairnessSchedulerThreadSafetyTests.testConcurrentRegistration
- FairnessSchedulerThreadSafetyTests.testConcurrentUnregistration
- FairnessSchedulerThreadSafetyTests.testConcurrentEnqueueWork
- FairnessSchedulerThreadSafetyTests.testConcurrentRegisterAndUnregister
- FairnessSchedulerThreadSafetyTests.testConcurrentEnqueueAndUnregister
- FairnessSchedulerLifecycleEdgeCaseTests.testUnregisterDuringExecuteTurn
The test was just a no-crash check - it didn't verify that shouldRead
returns false when backpressure is blocked.

Fix: Use testIoAllowedOverride to bypass jobManager requirement, then
assert shouldRead is false when backpressure level is blocked.
The test only checked initial .none levels for two executors. It never
drove one executor to heavy/blocked backpressure and verified the other
remained unaffected.

Fix: Add tokens to executor1 to create blocked backpressure, then assert
executor2 remains at .none - verifying per-session backpressure isolation.
…round thread

The test gated the core assertion on !Thread.isMainThread, but XCTest
runs tests on the main thread by default, so the assertion never executed.

Fix: Run the test logic from a background thread via DispatchQueue.global()
and use an expectation to wait for completion. This ensures the non-blocking
assertion is actually executed and verified.
- run_fairness_tests.sh fails if pre-existing crash reports are found
  (they would trigger UKCrashReporter dialog and block automation)
- Checks for new crash reports after tests complete
- Detects 'Program crashed' in test output
- Returns non-zero exit code if crashes detected
- Document crash report locations and UKCrashReporter behavior
- Document ASan crashes bypass internal logging
- Document test race condition pattern with background threads
- Project instructions require asking user before deleting crash reports
The test was a noop - it just created executors and asserted they weren't nil.

Fix: Use Objective-C runtime to verify that TokenExecutor doesn't have
activeSessionsWithTokens class or instance method, confirming the legacy
preemption mechanism was actually removed.
The test only checked total length but didn't use the executedLengths
array it set up, and didn't actually assert ordering.

Fix:
- Use the executedLengths array to verify callback was invoked
- Add assertion that callback reported correct total length
- Clarify in comments that ordering is verified by TwoTierTokenQueueTests
- This test verifies integration: both priority levels execute correctly
…en count

The test was using group.lengthTotal (byte length = 100) as if it were
token count, which was confusing and inaccurate. Since we supply the
input with a known constant (tokenCount: 10), use that constant directly
rather than recalculating from the group. This avoids circular dependency
on token counting logic and keeps the test focused on budget enforcement
at group boundaries.
Add PTYSessionBackpressureWiringTests to verify the critical wiring path
where PTYSession.taskDidRegister sets up the tokenExecutor's
backpressureReleaseHandler to call PTYTask.updateReadSourceState.

Tests verify:
- Handler is set after taskDidRegister
- Task's tokenExecutor property is set
- Handler invocation calls updateReadSourceState
- Handler uses weak reference to task (no retain cycle)

Also:
- Add SpyPTYTask mock for tracking updateReadSourceState calls
- Make updateReadSourceState public in PTYTask.h for testability
- Add PTYSessionBackpressureWiringTests to test runner
IntegrationTests.swift:
- Replace waitForCondition polling with deterministic queue sync loops
- Replace magic 0..<20 loops with waitForSchedulerQuiescence()
- Fix 10 tests that previously relied on arbitrary timeouts

TestUtilities.swift:
- Add waitForSchedulerQuiescence(maxIterations:) - iteration-based quiescence
- Add FairnessScheduler.waitForQuiescenceIterative() extension

FairnessScheduler.swift:
- Add dispatchPrecondition to register() to catch deadlock-prone patterns
  (calling sync from within mutation queue would deadlock)
FairnessSchedulerTests.swift:
- Convert 6 concurrent/stress tests to bounded-progress assertions
- Remove arbitrary timeouts (10-30s) replaced with group.wait() + queue drain
- testConcurrentRegistration kept as watchdog with 60s timeout
- testManySessionsStressTest uses iteration-capped loop instead of polling

Tests now complete in < 1s each vs previous 10-30s timeout waits.
PTYTask.m: Fix race condition in dispatch source teardown where suspend
state could change between capture and async teardown block execution.
Now clears source ivars first to prevent new state updates, then either
tears down inline (if on ioQueue) or uses dispatch_sync to capture
consistent state before async teardown.

TaskNotifier coprocess tests: Simplified to structural no-ops pending
architectural review of select()/dispatch_source interaction for
coprocess handling.

Test improvements:
- Replace timing-based waits with iteration-based loops for determinism
- Add MockCoprocess infrastructure for future coprocess testing
- Enhance MockTaskNotifierTask with coprocess callback tracking
The existing test only drove to .blocked (availableSlots <= 0) but never
verified suspend behavior at .heavy with availableSlots > 0.

Split into two tests:
- testBackpressureHeavyWithPositiveSlotsStillSuspendsReadSource: Tests the
  'heavy but not blocked' case (ratio < 0.25, availableSlots > 0). Verifies
  shouldRead's 'backpressureLevel < .heavy' gate works correctly.
- testBackpressureBlockedSuspendsReadSource: Tests the blocked case
  (availableSlots <= 0) which the original test was actually exercising.
The tests for non-blocking addTokens behavior didn't block consumption,
so a semaphore-based implementation could still pass if tokens drained
fast enough. Now both tests:

1. Set shouldQueueTokens=true to block consumption
2. Add more tokens than buffer capacity (100 > 40 slots)
3. Verify completion without blocking
4. Assert backpressure reached .blocked level (proving test conditions met)

This ensures the tests would fail with the old semaphore-based blocking
implementation where addTokens would wait for permits that never come.
This is the critical test for the TaskNotifier starvation fix. The test
proves that when session A is backpressured, session B can still read
data - the core behavior change from the old select() model.

Test setup:
1. Two PTYTasks with real pipes and dispatch sources
2. Drive session A to blocked backpressure (suspending its read source)
3. Write data to BOTH sessions' pipes
4. Verify session B receives data (dispatch source fires independently)
5. Verify session A does NOT receive data (read source suspended)

In the old TaskNotifier select() model, both sessions would have blocked
because they shared the same select loop. With dispatch_source, each
session reads independently.
The hybrid architecture keeps coprocess FDs on select() while PTY FDs use
dispatch_source. However, the data flow between them was broken:

1. PTY output wasn't routed to coprocess - handleReadEvent only called
   threadedReadTask but not writeToCoprocess (unlike the select() path)

2. Coprocess output wasn't triggering PTY writes - writeTask:coprocess:
   called unblock (for select) but not writeBufferDidChange (for dispatch)

Fixes:
- Add writeToCoprocess call to handleReadEvent for PTY→coprocess flow
- Add writeBufferDidChange call to writeTask:coprocess: for coprocess→PTY

Also updates tests to remove TODO comments about "fundamental conflicts"
since the hybrid approach is now properly implemented.
Fix deadlock in writeTask:coprocess: where writeBufferDidChange was
called while holding writeLock. Since writeBufferDidChange calls
shouldWrite which also acquires writeLock (non-recursive NSLock),
this caused a deadlock. Move writeBufferDidChange outside the lock.

Add end-to-end behavioral tests for coprocess data flow bridge:
- testHandleReadEventRoutesToCoprocess: PTY → coprocess direction
- testCoprocessOutputRoutesToPTY: coprocess → PTY direction
- testWriteTaskTriggersWriteSource: write source mechanics

Add testWriteFromCoprocess: method to PTYTask (Testing) category
to expose the coprocess → PTY path for testing.
Fixed a race condition in teardownDispatchSources where dispatch_async
allowed the PTYTask to be deallocated while cleanup blocks were pending.
Changed to dispatch_sync to ensure cleanup completes before returning.

Split PTYTaskDispatchSourceTests.swift (2453 lines) into 4 focused files:
- PTYTaskDispatchSourceBasicTests.swift (lifecycle, protocol, state)
- PTYTaskDispatchSourcePredicateTests.swift (shouldRead/shouldWrite)
- PTYTaskDispatchSourceHandlerTests.swift (read/write handlers)
- PTYTaskDispatchSourceIntegrationTests.swift (end-to-end tests)

Added test coverage for identified gaps:
- Gap 1: Verify TaskNotifier skips processWrite for dispatch source tasks
- Gap 2: Verify teardown is safe with suspended read/write sources
- Gap 6: Verify write path round-trip (writeTask data appears on fd)
Gap 1: Verify TaskNotifier skips processWrite for dispatch source tasks
- testDispatchSourceTaskSkipsProcessWrite
- testLegacyTaskProcessWriteCalledBySelect (inverse verification)

Gap 3: Verify TaskNotifier calls [coprocess write] when coprocess wantToWrite
- testCoprocessWriteFdProcessedBySelect verifies select() drains coprocess.outputBuffer

Gap 4: Verify addTokens notifies scheduler when called from non-mutation queue
- testAddTokensFromBackgroundQueueNotifiesScheduler
- testAddTokensFromMultipleBackgroundQueuesAllExecute

Fix racy assertion in testWriteTaskTriggersWriteSource:
The write dispatch source can drain writeBuffer before the test checks it.
Removed the intermediate assertion; the test still verifies data reaches the pipe.
Remove accidentally committed local development changes:
- Makefile build convenience flags (CODESIGN_FLAGS, JOBS, etc.)
- Rebuilt ThirdParty framework binaries
- Restore CLAUDE.md, README.md, WebExtensionsFramework/.claude/
- Restore last-xcode-version and utilities.tgz
@chall37
Copy link
Copy Markdown
Contributor Author

chall37 commented Feb 12, 2026

Implemented behind feature flag in #580.

@chall37 chall37 closed this Feb 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants