
Fix file handle leaks in DFS/DFSU readers and generic functions#921

Open
Copilot wants to merge 6 commits into main from copilot/fix-file-handle-leak

Conversation


Copilot AI commented Feb 20, 2026

File handles opened via mikecore were not being closed in several code paths, causing resource exhaustion on Windows for long-running processes or those reading 500+ files.

Leaked handles fixed

  • _dfs1.py __init__: Dfs1FileOpen opened to read spatial axis info, stored in self._dfs, never closed
  • _dfs3.py _read_dfs3_header: Dfs3FileOpen opened to read spatial axis info, never closed (also removed a dead self._source assignment)
  • _dfsu.py _dfs_read_item_time_func: DfsuFile.Open called per iteration inside the extract_track() loop, never closed; the most critical leak
  • generic.py avg_time: input dfs_i handle never closed
  • generic.py quantile: input dfs_i handle never closed

Pattern

Each fix follows the same approach: close the handle immediately after the data needed from it has been read, matching the convention used elsewhere in the codebase.

# Before (_dfs1.py)
self._dfs = DfsFileFactory.Dfs1FileOpen(str(filename))
self._x0 = self._dfs.SpatialAxis.X0  # handle left open

# After
dfs = DfsFileFactory.Dfs1FileOpen(str(filename))
self._x0 = dfs.SpatialAxis.X0
dfs.Close()
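The extract_track() fix follows the same close-after-read idea, but wrapped in try/finally so an exception mid-read cannot skip the close. A self-contained sketch of that pattern, using Python's builtin open as a stand-in for DfsuFile.Open (all names here are illustrative, not mikeio's actual helpers):

```python
import tempfile

def read_item_step(open_func, path, item, step):
    # Stand-in for the fixed _dfs_read_item_time_func: the handle is
    # closed before the data is used, even if the read raises.
    dfs = open_func(path)       # stand-in for DfsuFile.Open(filename)
    try:
        return dfs.readline()   # stand-in for dfs.ReadItemTimeStep(item + 1, step)
    finally:
        dfs.close()             # the line the fix adds (dfs.Close() in mikecore)

# Demo: track the handle the function opens so we can check it was closed.
_handles = []
def _tracking_open(p):
    h = open(p)
    _handles.append(h)
    return h

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("itemdata\n")
    _path = f.name

first_line = read_item_step(_tracking_open, _path, item=0, step=0)
handle_closed = _handles[0].closed
```

Called per item per timestep inside a track loop, this shape opens and closes one handle per call instead of accumulating them.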

Verified no other leaks

All DfsFileFactory.*Open and DfsuFile.Open calls across the codebase were audited, including Dfs2 which already used the correct pattern (local variable + Close()). No additional leaks were found — all other open calls have matching .Close() calls either directly or via write_dfs_data/write_dfsu_data.

Tests

Added file handle leak tests to existing test files, covering all DFS types:

  • test_dfs1.py: test_count_open_fds_sanity_check — verifies that /proc/self/fd counting actually detects mikecore file handles (opening increases count, closing restores it); test_dfs1_init_closes_file_handle — creates 50 Dfs1 instances while holding references; with the old code each instance would leak a file descriptor, with the fix zero are leaked
  • test_dfs2.py: test_dfs2_init_closes_file_handle — verifies Dfs2.__init__ does not leak; test_dfs2_read_closes_file_handle — verifies Dfs2.read() does not leak
  • test_dfs3.py: test_dfs3_init_closes_file_handle — same pattern for Dfs3
  • test_dfsu2dh.py: test_dfsu_read_closes_file_handle — creates and reads 50 Dfsu files while holding references to the results

Tests hold references to created instances to prevent Python's garbage collector from masking leaks via __del__, ensuring the leak is detected deterministically.
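The FD-counting approach those tests rely on can be sketched in a few lines. This is Linux-only (it reads /proc/self/fd), and LeakyReader is a deliberately leaky stand-in for the pre-fix Dfs1.__init__, not mikeio code:

```python
import os

def _count_open_fds() -> int:
    # Linux-only: one entry per open file descriptor in this process
    return len(os.listdir("/proc/self/fd"))

class LeakyReader:
    """Stand-in for the pre-fix pattern: opens a handle, never closes it."""
    def __init__(self, path: str):
        self._fd = os.open(path, os.O_RDONLY)   # leaked on purpose
    def close(self) -> None:
        os.close(self._fd)

baseline = _count_open_fds()
instances = [LeakyReader("/dev/null") for _ in range(50)]  # hold references
leaked = _count_open_fds() - baseline        # 50: GC cannot mask the leak
for r in instances:
    r.close()
restored = _count_open_fds() - baseline      # back to 0 after explicit close
```

Holding the 50 instances in a list is the key detail: if they were garbage-collected, a __del__ finalizer could close the descriptors and hide the leak from the counter.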

Original prompt

This section details the original issue you should resolve

<issue_title>File handles are maybe not being released</issue_title>
<issue_description>I think we need to investigate if there's file handles being leaked based on this issue:

DHI/mikecore-python#41

It seems the handles are living on long after data has been read, which is causing problems on Windows for long-running processes or ones which read 500+ files in a single process.</issue_description>

<agent_instructions>Fix this bug</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@ecomodeller Related: #820 (memory leak when reading dfsu many times)


Fix file handle leaks in:
- _dfs1.py: close Dfs1FileOpen handle after reading spatial info in __init__
- _dfs3.py: close Dfs3FileOpen handle after reading header info
- _dfsu.py: close DfsuFile handle in _dfs_read_item_time_func
- generic.py: close input file handle in avg_time() and quantile()

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix potential file handle leaks in long-running processes" to "Fix file handle leaks in DFS/DFSU readers and generic functions" on Feb 20, 2026
Copilot AI requested a review from ryan-kipawa February 20, 2026 15:06
Contributor

@ryan-kipawa left a comment


Prove this fixes the bug by first exposing it with a failing test. There's already a PR upstream that shows how to expose it on Windows, available here: DHI/mikecore-python#42

Comment thread: tests/test_file_handle_close.py (Outdated)

for _ in range(50):
    mikeio.Dfs1("tests/testdata/random.dfs1")
gc.collect()
assert _count_open_fds() - baseline == 0
Contributor


Check that _count_open_fds() is actually working as expected

Author


Added test_count_open_fds_sanity_check in 8f56eff which verifies that _count_open_fds() detects mikecore file handles: opening a DfsFile increases the FD count by 1, and closing it restores it. Also updated the other tests to hold references to the created instances (preventing GC from masking the leak) and removed gc.collect() before assertions so leaks are caught deterministically.

@ryan-kipawa
Contributor

@copilot

- Add sanity check test verifying _count_open_fds() detects mikecore
  file handles (open increases count, close restores it)
- Hold references to instances in tests so GC cannot mask the leak
- Remove gc.collect() before assertions to catch leaks deterministically
- Add docstrings explaining how each test exposes the original bug

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Author

Copilot AI commented Feb 20, 2026

Prove this fixes the bug by first exposing it with a failing test. There's already a PR upstream that shows how to expose it on Windows, available here: DHI/mikecore-python#42

Updated the tests in 8f56eff to properly expose the bug. Each test now holds references to the created instances (e.g. instances.append(mikeio.Dfs1(...))), preventing GC from closing leaked handles. With the old code (self._dfs = DfsFileFactory.Dfs1FileOpen(...) never closed), 50 instances would leak 50 file descriptors; with the fix, 0 are leaked. Also added a sanity check test confirming that /proc/self/fd counts actually detect mikecore handles.

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
@ryan-kipawa
Contributor

@copilot Review this

Author

Copilot AI commented Feb 20, 2026

@copilot Review this

I've reviewed all the changes. Here's the summary:

Source fixes (all correct):

  • _dfs1.py __init__: Changed self._dfs to local dfs, added dfs.Close(). The _open() method still correctly sets self._dfs for read() which closes it.
  • _dfs3.py _read_dfs3_header: Same pattern. Removed dead self._source = self._dfs (no readers of self._source exist in this file).
  • _dfsu.py _dfs_read_item_time_func: Added dfs.Close() — this was the most critical leak since it's called per-item-per-timestep inside extract_track().
  • generic.py avg_time/quantile: Added dfs_i.Close() after all reads complete, before writing output.

No additional leaks found. I checked all other DfsFileFactory.*Open/DfsuFile.Open calls across the codebase — they all have matching .Close() calls (either directly or via write_dfs_data/write_dfsu_data).

Tests verify the fix: The sanity check confirms /proc/self/fd detects mikecore handles, and the leak tests hold references to prevent GC from masking the issue. All 204 related tests + 4 new tests pass.

@ryan-kipawa
Contributor

@copilot You have removed a private instance variable that previously held an open file handle, and replaced it with local file handles with matching open and close calls. Why did this exist as an instance variable in the original codebase? Double check that nothing breaks because of this (e.g. potential logical errors not covered by the test suite).

Author

Copilot AI commented Feb 20, 2026

@copilot You have removed a private instance variable that previously held an open file handle, and replaced it with local file handles with matching open and close methods. Why did this exist as an i...

I investigated this thoroughly. Here's why self._dfs existed and why the change is safe:

Why it existed: The original Dfs1/Dfs3 __init__ called super().__init__() which opens a DfsGenericOpen handle, saves it to self._dfs, reads metadata, then closes it (line 344). The subclass then overwrote self._dfs with a new type-specific handle (Dfs1FileOpen/Dfs3FileOpen) that was never closed — this was the leak. The overwrite appears unintentional since Dfs2.__init__ already used the correct pattern (local variable + Close()).

Why nothing breaks:

  1. self._dfs still exists — it points to the base class's closed DfsGenericOpen handle (set at line 305). mikecore retains metadata in memory after Close().

  2. timestep property (line 431) — accesses self._dfs.FileInfo.TimeAxis.TimeStepInSeconds() on the closed handle. Works correctly (verified: Dfs1=12.0, Dfs2=3600.0, Dfs3=30.0).

  3. _get_item_info (line 382) — only called during __init__ at line 308, before Close() at line 344.

  4. read() methods — each calls _open() which sets self._dfs to a fresh handle, then closes it after reading. Unaffected by __init__ changes.

  5. self._source in Dfs3 — was dead code (assigned but never read anywhere in the codebase). Dfs2._open() has the same dead assignment at line 288 but I left it since it's not in scope.

Verified all 208 related tests pass, plus manual testing of timestep, end_time, items, and multiple sequential read() calls.
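The ownership rules described above can be condensed into a small stand-in model (pure Python, no mikecore; the class and method names are illustrative only):

```python
class _Handle:
    """Counts live handles, mimicking a mikecore file handle."""
    live = 0
    def __init__(self):
        _Handle.live += 1
        self.closed = False
        self.FileInfo = "metadata"   # mikecore retains metadata after Close()
    def Close(self):
        if not self.closed:
            self.closed = True
            _Handle.live -= 1

class _DfsBase:
    def __init__(self):
        # Base class: open a generic handle, read metadata, close it.
        # self._dfs keeps pointing at the closed handle for metadata access.
        self._dfs = _Handle()
        self._dfs.Close()

class Dfs1Fixed(_DfsBase):
    def __init__(self):
        super().__init__()
        dfs = _Handle()   # type-specific handle (Dfs1FileOpen), now local
        # ... read spatial axis info from dfs ...
        dfs.Close()       # the fix: close instead of overwriting self._dfs

readers = [Dfs1Fixed() for _ in range(50)]
live_handles = _Handle.live             # 0: nothing leaked
metadata = readers[0]._dfs.FileInfo     # still readable after Close()
```

The pre-fix bug corresponds to writing `self._dfs = _Handle()` in the subclass without a Close(): the base class's closed handle gets overwritten by an open one that nobody releases.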

@ryan-kipawa
Contributor

@copilot Suggest alternatives for making the codebase more resilient to resource leaks in order to proactively reduce maintenance burden.

Author

Copilot AI commented Feb 20, 2026

@copilot Suggest alternatives for making the codebase more resilient to resource leaks in order to proactively reduce maintenance burden.

Here are three alternatives, ordered by impact-to-effort ratio:

1. Add context manager support to mikecore (upstream fix)
Neither DfsFile nor DfsuFile implement __enter__/__exit__. Adding this upstream would enable with blocks throughout mikeio, making leaks structurally impossible:

# Instead of manual open/close pairs:
with DfsuFile.Open(self._filename) as dfs:
    itemdata = dfs.ReadItemTimeStep(item + 1, step)

This is the highest-leverage change since it prevents the entire class of bugs. A PR to mikecore would be straightforward (3-4 lines per class).
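Assuming mikecore's handle classes keep their current Close() method, the upstream addition might look like this, sketched on a stand-in class rather than mikecore's actual source:

```python
class DfsFileSketch:
    """Stand-in for mikecore's DfsFile/DfsuFile with the proposed
    __enter__/__exit__ methods added (roughly the 3-4 lines per class)."""
    def __init__(self):
        self.closed = False
    def Close(self):
        self.closed = True
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        self.Close()
        return False    # do not swallow exceptions from the with-body

# The with-block closes the handle even when the body raises.
dfs = DfsFileSketch()
try:
    with dfs:
        raise RuntimeError("read failed")
except RuntimeError:
    pass
closed_after_error = dfs.closed
```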

2. Add a try/finally wrapper in mikeio
Until mikecore has context manager support, mikeio could add a thin contextlib.contextmanager wrapper:

from contextlib import contextmanager

@contextmanager
def _open_dfs(open_func, *args):
    dfs = open_func(*args)
    try:
        yield dfs
    finally:
        dfs.Close()

This could be adopted incrementally — generic.py alone has ~20 open/close pairs across functions like concat, extract, scale, avg_time, and quantile where an exception between Open and Close would leak a handle.
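The exception-safety such a wrapper buys can be demonstrated end to end with a stand-in handle class in place of a real mikecore open call (names here are illustrative):

```python
from contextlib import contextmanager

class _Handle:
    """Stand-in for a mikecore handle exposing Close()."""
    def __init__(self):
        self.closed = False
    def Close(self):
        self.closed = True

@contextmanager
def _open_dfs(open_func, *args):
    # Thin wrapper: guarantees Close() runs even if the body raises.
    dfs = open_func(*args)
    try:
        yield dfs
    finally:
        dfs.Close()

h = _Handle()
try:
    with _open_dfs(lambda: h):
        raise ValueError("read failed mid-file")   # e.g. a corrupt timestep
except ValueError:
    pass
closed_despite_error = h.closed
```

In today's manual open/close pairs, that mid-read exception would skip the Close() and leak the handle; with the wrapper it cannot.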

3. Add a lint rule or test to catch regressions
A static analysis check (e.g. a custom ruff rule or a grep-based CI check) that flags DfsFileFactory.*Open or DfsuFile.Open calls where the result is assigned to self.* without a Close() in the same scope. Alternatively, expand the FD-counting test to cover more code paths (e.g. generic.avg_time, extract_track).

@ryan-kipawa
Contributor

@copilot Implement alternative two in a separate PR that branches off this one. Your implementation should make it easy in the future to switch to alternative one when it's been added upstream.

Author

Copilot AI commented Feb 20, 2026

@copilot Implement alternative two in a separate PR that branches off this one. Your implementation should make it easy in the future to switch to alternative one when it's been added upstream.

I'm unable to create separate branches or open new PRs from this environment. I can implement the context manager wrapper in this PR if you'd like, or you can open a new branch off copilot/fix-file-handle-leak and I can implement it there once assigned the issue.

The implementation plan would be:

  1. Add _open_dfs context manager in src/mikeio/dfs/_dfs.py that wraps any mikecore open call with try/finally: dfs.Close()
  2. Design it so switching to upstream context manager support (alternative one) is a single-line change — just replace the wrapper body with yield open_func(*args) when mikecore adds __enter__/__exit__
  3. Adopt it in the files touched by this PR first (_dfs1.py, _dfs3.py, _dfsu.py, generic.py)

@ryan-kipawa
Contributor

@copilot Put your newly created tests into appropriate existing test files.

Moved tests from test_file_handle_close.py into their corresponding
existing test files:
- test_count_open_fds_sanity_check and test_dfs1_init_closes_file_handle → test_dfs1.py
- test_dfs3_init_closes_file_handle → test_dfs3.py
- test_dfsu_read_closes_file_handle → test_dfsu2dh.py

Deleted the standalone test_file_handle_close.py.

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Author

Copilot AI commented Feb 22, 2026

@copilot Put your newly created tests into appropriate existing test files.

Done in c33343e. Moved the tests into their corresponding existing test files:

  • test_count_open_fds_sanity_check + test_dfs1_init_closes_file_handle → test_dfs1.py
  • test_dfs3_init_closes_file_handle → test_dfs3.py
  • test_dfsu_read_closes_file_handle → test_dfsu2dh.py

Deleted test_file_handle_close.py.

@ryan-kipawa
Contributor

@copilot You previously double checked your work to make sure all open calls have matching close calls. The purpose of this was to not leak file handles.

No additional leaks found. I checked all other DfsFileFactory.*Open/DfsuFile.Open calls across the codebase — they all have matching .Close() calls (either directly or via write_dfs_data/write_dfsu_data).

Please check once more, with particular focus on dfs2 files.

Added test_dfs2_init_closes_file_handle and test_dfs2_read_closes_file_handle
to verify Dfs2.__init__ and Dfs2.read() don't leak file handles. Both pass —
Dfs2 already used the correct pattern (local variable + Close()).

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Author

Copilot AI commented Feb 23, 2026

@copilot You previously double checked your work to make sure all open calls have matching close calls. The purpose of this was to not leak file handles.

No additional leaks found. I checked ...

Rechecked all DfsFileFactory.*Open and DfsuFile.Open calls across the codebase with particular focus on dfs2. Results:

Dfs2 is already correct: __init__ uses a local dfs variable and calls dfs.Close() at line 154; read() calls _open() which sets self._dfs, then closes it at line 235; append() passes the handle to write_dfs_data(), which closes it internally.

I also ran a comprehensive FD-counting test across all file types (Dfs0, Dfs1, Dfs2, Dfs3, Dfsu2DH) for both __init__ and read() — zero leaks in all cases.

Added test_dfs2_init_closes_file_handle and test_dfs2_read_closes_file_handle to test_dfs2.py in 0c4ebba to cover Dfs2 explicitly.

@ryan-kipawa ryan-kipawa marked this pull request as ready for review February 23, 2026 08:48

Development

Successfully merging this pull request may close these issues.

File handles are maybe not being released

2 participants