
Fix file handle leaks in DFS/DFSU readers and generic functions#921

Open
Copilot wants to merge 6 commits into main from copilot/fix-file-handle-leak

Conversation


Copilot AI commented Feb 20, 2026

File handles opened via mikecore were not being closed in several code paths, causing resource exhaustion on Windows for long-running processes or those reading 500+ files.

Leaked handles fixed

  • _dfs1.py __init__: Dfs1FileOpen opened to read spatial axis info, stored in self._dfs, never closed
  • _dfs3.py _read_dfs3_header: Dfs3FileOpen opened to read spatial axis info, never closed (also removed a dead self._source assignment)
  • _dfsu.py _dfs_read_item_time_func: DfsuFile.Open called per iteration inside the extract_track() loop, never closed; the most critical leak
  • generic.py avg_time: input dfs_i handle never closed
  • generic.py quantile: input dfs_i handle never closed

Pattern

Each fix follows the same approach: close the handle immediately after the data needed from it has been read, matching the convention used elsewhere in the codebase.

# Before (_dfs1.py)
self._dfs = DfsFileFactory.Dfs1FileOpen(str(filename))
self._x0 = self._dfs.SpatialAxis.X0  # handle left open

# After
dfs = DfsFileFactory.Dfs1FileOpen(str(filename))
self._x0 = dfs.SpatialAxis.X0
dfs.Close()
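The extract_track() fix follows the same close-after-read idea, but wrapped in try/finally so an exception mid-read cannot skip the close. A self-contained sketch of that pattern, using Python's builtin open as a stand-in for DfsuFile.Open (all names here are illustrative, not mikeio's actual helpers):

```python
import tempfile

def read_item_step(open_func, path, item, step):
    # Stand-in for the fixed _dfs_read_item_time_func: the handle is
    # closed before the data is used, even if the read raises.
    dfs = open_func(path)       # stand-in for DfsuFile.Open(filename)
    try:
        return dfs.readline()   # stand-in for dfs.ReadItemTimeStep(item + 1, step)
    finally:
        dfs.close()             # the line the fix adds (dfs.Close() in mikecore)

# Demo: track the handle the function opens so we can check it was closed.
_handles = []
def _tracking_open(p):
    h = open(p)
    _handles.append(h)
    return h

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("itemdata\n")
    _path = f.name

first_line = read_item_step(_tracking_open, _path, item=0, step=0)
handle_closed = _handles[0].closed
```

Called per item per timestep inside a track loop, this shape opens and closes one handle per call instead of accumulating them.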

Verified no other leaks

All DfsFileFactory.*Open and DfsuFile.Open calls across the codebase were audited, including Dfs2 which already used the correct pattern (local variable + Close()). No additional leaks were found — all other open calls have matching .Close() calls either directly or via write_dfs_data/write_dfsu_data.

Tests

Added file handle leak tests to existing test files, covering all DFS types:

  • test_dfs1.py: test_count_open_fds_sanity_check — verifies that /proc/self/fd counting actually detects mikecore file handles (opening increases count, closing restores it); test_dfs1_init_closes_file_handle — creates 50 Dfs1 instances while holding references; with the old code each instance would leak a file descriptor, with the fix zero are leaked
  • test_dfs2.py: test_dfs2_init_closes_file_handle — verifies Dfs2.__init__ does not leak; test_dfs2_read_closes_file_handle — verifies Dfs2.read() does not leak
  • test_dfs3.py: test_dfs3_init_closes_file_handle — same pattern for Dfs3
  • test_dfsu2dh.py: test_dfsu_read_closes_file_handle — creates and reads 50 Dfsu files while holding references to the results

Tests hold references to created instances to prevent Python's garbage collector from masking leaks via __del__, ensuring the leak is detected deterministically.
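The FD-counting approach those tests rely on can be sketched in a few lines. This is Linux-only (it reads /proc/self/fd), and LeakyReader is a deliberately leaky stand-in for the pre-fix Dfs1.__init__, not mikeio code:

```python
import os

def _count_open_fds() -> int:
    # Linux-only: one entry per open file descriptor in this process
    return len(os.listdir("/proc/self/fd"))

class LeakyReader:
    """Stand-in for the pre-fix pattern: opens a handle, never closes it."""
    def __init__(self, path: str):
        self._fd = os.open(path, os.O_RDONLY)   # leaked on purpose
    def close(self) -> None:
        os.close(self._fd)

baseline = _count_open_fds()
instances = [LeakyReader("/dev/null") for _ in range(50)]  # hold references
leaked = _count_open_fds() - baseline        # 50: GC cannot mask the leak
for r in instances:
    r.close()
restored = _count_open_fds() - baseline      # back to 0 after explicit close
```

Holding the 50 instances in a list is the key detail: if they were garbage-collected, a __del__ finalizer could close the descriptors and hide the leak from the counter.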

Original prompt

This section details the original issue you should resolve

<issue_title>File handles are maybe not being released</issue_title>
<issue_description>I think we need to investigate if there's file handles being leaked based on this issue:

DHI/mikecore-python#41

It seems the handles are living on long after data has been read, which is causing problems on Windows for long-running processes or ones which read 500+ files in a single process.</issue_description>

<agent_instructions>Fix this bug</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@ecomodeller Related: #820 (memory leak when reading dfsu many times)


Fix file handle leaks in:
- _dfs1.py: close Dfs1FileOpen handle after reading spatial info in __init__
- _dfs3.py: close Dfs3FileOpen handle after reading header info
- _dfsu.py: close DfsuFile handle in _dfs_read_item_time_func
- generic.py: close input file handle in avg_time() and quantile()

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix potential file handle leaks in long-running processes" to "Fix file handle leaks in DFS/DFSU readers and generic functions" on Feb 20, 2026
Copilot AI requested a review from ryan-kipawa February 20, 2026 15:06
Contributor

@ryan-kipawa left a comment


Prove this fixes the bug by first exposing it with a failing test. There's already a PR upstream that shows how to expose it on Windows, available here: DHI/mikecore-python#42

Comment thread: tests/test_file_handle_close.py (Outdated)

for _ in range(50):
    mikeio.Dfs1("tests/testdata/random.dfs1")
gc.collect()
assert _count_open_fds() - baseline == 0
Contributor


Check that _count_open_fds() is actually working as expected

Author


Added test_count_open_fds_sanity_check in 8f56eff which verifies that _count_open_fds() detects mikecore file handles: opening a DfsFile increases the FD count by 1, and closing it restores it. Also updated the other tests to hold references to the created instances (preventing GC from masking the leak) and removed gc.collect() before assertions so leaks are caught deterministically.

@ryan-kipawa
Contributor

@copilot

- Add sanity check test verifying _count_open_fds() detects mikecore
  file handles (open increases count, close restores it)
- Hold references to instances in tests so GC cannot mask the leak
- Remove gc.collect() before assertions to catch leaks deterministically
- Add docstrings explaining how each test exposes the original bug

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Author

Copilot AI commented Feb 20, 2026

Prove this fixes the bug by first exposing it with a failing test. There's already a PR upstream that shows how to expose it on Windows, available here: DHI/mikecore-python#42

Updated the tests in 8f56eff to properly expose the bug. Each test now holds references to the created instances (e.g. instances.append(mikeio.Dfs1(...))), preventing GC from closing leaked handles. With the old code (self._dfs = DfsFileFactory.Dfs1FileOpen(...) never closed), 50 instances would leak 50 file descriptors; with the fix, 0 are leaked. Also added a sanity check test confirming that /proc/self/fd counts actually detect mikecore handles.

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
@ryan-kipawa
Contributor

@copilot Review this

Author

Copilot AI commented Feb 20, 2026

@copilot Review this

I've reviewed all the changes. Here's the summary:

Source fixes (all correct):

  • _dfs1.py __init__: Changed self._dfs to local dfs, added dfs.Close(). The _open() method still correctly sets self._dfs for read() which closes it.
  • _dfs3.py _read_dfs3_header: Same pattern. Removed dead self._source = self._dfs (no readers of self._source exist in this file).
  • _dfsu.py _dfs_read_item_time_func: Added dfs.Close() — this was the most critical leak since it's called per-item-per-timestep inside extract_track().
  • generic.py avg_time/quantile: Added dfs_i.Close() after all reads complete, before writing output.

No additional leaks found. I checked all other DfsFileFactory.*Open/DfsuFile.Open calls across the codebase — they all have matching .Close() calls (either directly or via write_dfs_data/write_dfsu_data).

Tests verify the fix: The sanity check confirms /proc/self/fd detects mikecore handles, and the leak tests hold references to prevent GC from masking the issue. All 204 related tests + 4 new tests pass.

@ryan-kipawa
Contributor

@copilot You have removed a private instance variable that previously held an open file handle, and replaced it with local file handles with matching open and close calls. Why did this exist as an instance variable in the original codebase? Double check that nothing breaks because of this (e.g. potential logical errors not covered by the test suite).

Author

Copilot AI commented Feb 20, 2026

@copilot You have removed a private instance variable that previously held an open file handle, and replaced it with local file handles with matching open and close methods. Why did this exist as an i...

I investigated this thoroughly. Here's why self._dfs existed and why the change is safe:

Why it existed: The original Dfs1/Dfs3 __init__ called super().__init__() which opens a DfsGenericOpen handle, saves it to self._dfs, reads metadata, then closes it (line 344). The subclass then overwrote self._dfs with a new type-specific handle (Dfs1FileOpen/Dfs3FileOpen) that was never closed — this was the leak. The overwrite appears unintentional since Dfs2.__init__ already used the correct pattern (local variable + Close()).

Why nothing breaks:

  1. self._dfs still exists — it points to the base class's closed DfsGenericOpen handle (set at line 305). mikecore retains metadata in memory after Close().

  2. timestep property (line 431) — accesses self._dfs.FileInfo.TimeAxis.TimeStepInSeconds() on the closed handle. Works correctly (verified: Dfs1=12.0, Dfs2=3600.0, Dfs3=30.0).

  3. _get_item_info (line 382) — only called during __init__ at line 308, before Close() at line 344.

  4. read() methods — each calls _open() which sets self._dfs to a fresh handle, then closes it after reading. Unaffected by __init__ changes.

  5. self._source in Dfs3 — was dead code (assigned but never read anywhere in the codebase). Dfs2._open() has the same dead assignment at line 288 but I left it since it's not in scope.

Verified all 208 related tests pass, plus manual testing of timestep, end_time, items, and multiple sequential read() calls.
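The ownership rules described above can be condensed into a small stand-in model (pure Python, no mikecore; the class and method names are illustrative only):

```python
class _Handle:
    """Counts live handles, mimicking a mikecore file handle."""
    live = 0
    def __init__(self):
        _Handle.live += 1
        self.closed = False
        self.FileInfo = "metadata"   # mikecore retains metadata after Close()
    def Close(self):
        if not self.closed:
            self.closed = True
            _Handle.live -= 1

class _DfsBase:
    def __init__(self):
        # Base class: open a generic handle, read metadata, close it.
        # self._dfs keeps pointing at the closed handle for metadata access.
        self._dfs = _Handle()
        self._dfs.Close()

class Dfs1Fixed(_DfsBase):
    def __init__(self):
        super().__init__()
        dfs = _Handle()   # type-specific handle (Dfs1FileOpen), now local
        # ... read spatial axis info from dfs ...
        dfs.Close()       # the fix: close instead of overwriting self._dfs

readers = [Dfs1Fixed() for _ in range(50)]
live_handles = _Handle.live             # 0: nothing leaked
metadata = readers[0]._dfs.FileInfo     # still readable after Close()
```

The pre-fix bug corresponds to writing `self._dfs = _Handle()` in the subclass without a Close(): the base class's closed handle gets overwritten by an open one that nobody releases.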

@ryan-kipawa
Contributor

@copilot Suggest alternatives for making the codebase more resilient to resource leaks in order to proactively reduce maintenance burden.

Author

Copilot AI commented Feb 20, 2026

@copilot Suggest alternatives for making the codebase more resilient to resource leaks in order to proactively reduce maintenance burden.

Here are three alternatives, ordered by impact-to-effort ratio:

1. Add context manager support to mikecore (upstream fix)
Neither DfsFile nor DfsuFile implement __enter__/__exit__. Adding this upstream would enable with blocks throughout mikeio, making leaks structurally impossible:

# Instead of manual open/close pairs:
with DfsuFile.Open(self._filename) as dfs:
    itemdata = dfs.ReadItemTimeStep(item + 1, step)

This is the highest-leverage change since it prevents the entire class of bugs. A PR to mikecore would be straightforward (3-4 lines per class).
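Assuming mikecore's handle classes keep their current Close() method, the upstream addition might look like this, sketched on a stand-in class rather than mikecore's actual source:

```python
class DfsFileSketch:
    """Stand-in for mikecore's DfsFile/DfsuFile with the proposed
    __enter__/__exit__ methods added (roughly the 3-4 lines per class)."""
    def __init__(self):
        self.closed = False
    def Close(self):
        self.closed = True
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        self.Close()
        return False    # do not swallow exceptions from the with-body

# The with-block closes the handle even when the body raises.
dfs = DfsFileSketch()
try:
    with dfs:
        raise RuntimeError("read failed")
except RuntimeError:
    pass
closed_after_error = dfs.closed
```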

2. Add a try/finally wrapper in mikeio
Until mikecore has context manager support, mikeio could add a thin contextlib.contextmanager wrapper:

from contextlib import contextmanager

@contextmanager
def _open_dfs(open_func, *args):
    dfs = open_func(*args)
    try:
        yield dfs
    finally:
        dfs.Close()

This could be adopted incrementally — generic.py alone has ~20 open/close pairs across functions like concat, extract, scale, avg_time, and quantile where an exception between Open and Close would leak a handle.
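The exception-safety such a wrapper buys can be demonstrated end to end with a stand-in handle class in place of a real mikecore open call (names here are illustrative):

```python
from contextlib import contextmanager

class _Handle:
    """Stand-in for a mikecore handle exposing Close()."""
    def __init__(self):
        self.closed = False
    def Close(self):
        self.closed = True

@contextmanager
def _open_dfs(open_func, *args):
    # Thin wrapper: guarantees Close() runs even if the body raises.
    dfs = open_func(*args)
    try:
        yield dfs
    finally:
        dfs.Close()

h = _Handle()
try:
    with _open_dfs(lambda: h):
        raise ValueError("read failed mid-file")   # e.g. a corrupt timestep
except ValueError:
    pass
closed_despite_error = h.closed
```

In today's manual open/close pairs, that mid-read exception would skip the Close() and leak the handle; with the wrapper it cannot.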

3. Add a lint rule or test to catch regressions
A static analysis check (e.g. a custom ruff rule or a grep-based CI check) that flags DfsFileFactory.*Open or DfsuFile.Open calls where the result is assigned to self.* without a Close() in the same scope. Alternatively, expand the FD-counting test to cover more code paths (e.g. generic.avg_time, extract_track).

@ryan-kipawa
Contributor

@copilot Implement alternative two in a separate PR that branches off this one. Your implementation should make it easy in the future to switch to alternative one when it's been added upstream.

Author

Copilot AI commented Feb 20, 2026

@copilot Implement alternative two in a separate PR that branches off this one. Your implementation should make it easy in the future to switch to alternative one when it's been added upstream.

I'm unable to create separate branches or open new PRs from this environment. I can implement the context manager wrapper in this PR if you'd like, or you can open a new branch off copilot/fix-file-handle-leak and I can implement it there once assigned the issue.

The implementation plan would be:

  1. Add _open_dfs context manager in src/mikeio/dfs/_dfs.py that wraps any mikecore open call with try/finally: dfs.Close()
  2. Design it so switching to upstream context manager support (alternative one) is a single-line change — just replace the wrapper body with yield open_func(*args) when mikecore adds __enter__/__exit__
  3. Adopt it in the files touched by this PR first (_dfs1.py, _dfs3.py, _dfsu.py, generic.py)

@ryan-kipawa
Contributor

@copilot Put your newly created tests into appropriate existing test files.

Moved tests from test_file_handle_close.py into their corresponding
existing test files:
- test_count_open_fds_sanity_check and test_dfs1_init_closes_file_handle → test_dfs1.py
- test_dfs3_init_closes_file_handle → test_dfs3.py
- test_dfsu_read_closes_file_handle → test_dfsu2dh.py

Deleted the standalone test_file_handle_close.py.

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Author

Copilot AI commented Feb 22, 2026

@copilot Put your newly created tests into appropriate existing test files.

Done in c33343e. Moved the tests into their corresponding existing test files:

  • test_count_open_fds_sanity_check + test_dfs1_init_closes_file_handle → test_dfs1.py
  • test_dfs3_init_closes_file_handle → test_dfs3.py
  • test_dfsu_read_closes_file_handle → test_dfsu2dh.py

Deleted test_file_handle_close.py.

@ryan-kipawa
Contributor

@copilot You previously double checked your work to make sure all open calls have matching close calls. The purpose of this was to not leak file handles.

No additional leaks found. I checked all other DfsFileFactory.*Open/DfsuFile.Open calls across the codebase — they all have matching .Close() calls (either directly or via write_dfs_data/write_dfsu_data).

Please check once more, with particular focus on dfs2 files.

Added test_dfs2_init_closes_file_handle and test_dfs2_read_closes_file_handle
to verify Dfs2.__init__ and Dfs2.read() don't leak file handles. Both pass —
Dfs2 already used the correct pattern (local variable + Close()).

Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Author

Copilot AI commented Feb 23, 2026

@copilot You previously double checked your work to make sure all open calls have matching close calls. The purpose of this was to not leak file handles.

No additional leaks found. I checked ...

Rechecked all DfsFileFactory.*Open and DfsuFile.Open calls across the codebase with particular focus on dfs2. Results:

Dfs2 is already correct: __init__ uses a local dfs variable and calls dfs.Close() at line 154; read() calls _open() which sets self._dfs, then closes it at line 235; append() passes the handle to write_dfs_data(), which closes it internally.

I also ran a comprehensive FD-counting test across all file types (Dfs0, Dfs1, Dfs2, Dfs3, Dfsu2DH) for both __init__ and read() — zero leaks in all cases.

Added test_dfs2_init_closes_file_handle and test_dfs2_read_closes_file_handle to test_dfs2.py in 0c4ebba to cover Dfs2 explicitly.

@ryan-kipawa ryan-kipawa marked this pull request as ready for review February 23, 2026 08:48

Development

Successfully merging this pull request may close these issues.

File handles are maybe not being released

2 participants