Fix file handle leaks in DFS/DFSU readers and generic functions#921
Fix file handle leaks in DFS/DFSU readers and generic functions#921
Conversation
Fix file handle leaks in: - _dfs1.py: close Dfs1FileOpen handle after reading spatial info in __init__ - _dfs3.py: close Dfs3FileOpen handle after reading header info - _dfsu.py: close DfsuFile handle in _dfs_read_item_time_func - generic.py: close input file handle in avg_time() and quantile() Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
ryan-kipawa
left a comment
There was a problem hiding this comment.
Prove this fixes the bug by first exposing it with a failing test. There's already a PR upstream that shows how to expose it on Windows, available here: DHI/mikecore-python#42
| for _ in range(50): | ||
| mikeio.Dfs1("tests/testdata/random.dfs1") | ||
| gc.collect() | ||
| assert _count_open_fds() - baseline == 0 |
There was a problem hiding this comment.
Check that _count_open_fds() is actually working as expected
There was a problem hiding this comment.
Added test_count_open_fds_sanity_check in 8f56eff which verifies that _count_open_fds() detects mikecore file handles: opening a DfsFile increases the FD count by 1, and closing it restores it. Also updated the other tests to hold references to the created instances (preventing GC from masking the leak) and removed gc.collect() before assertions so leaks are caught deterministically.
- Add sanity check test verifying _count_open_fds() detects mikecore file handles (open increases count, close restores it) - Hold references to instances in tests so GC cannot mask the leak - Remove gc.collect() before assertions to catch leaks deterministically - Add docstrings explaining how each test exposes the original bug Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Updated the tests in 8f56eff to properly expose the bug. Each test now holds references to the created instances (e.g. |
Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
|
@copilot Review this |
I've reviewed all the changes. Here's the summary: Source fixes (all correct):
No additional leaks found. I checked all other Tests verify the fix: The sanity check confirms |
|
@copilot You have removed a private instance variable that previously held an open file handle, and replaced it with local file handles with matching open and close methods. Why did this exist as an instance variable in the original codebase? Double check that nothings breaks because of this (e.g. potential logical errors not covered by the test suite). |
I investigated this thoroughly. Here's why Why it existed: The original Dfs1/Dfs3 Why nothing breaks:
Verified all 208 related tests pass, plus manual testing of |
|
@copilot Suggest alternatives for making the codebase more resilient to resource leaks in order to proactively reduce maintenance burden. |
Here are three alternatives, ordered by impact-to-effort ratio: 1. Add context manager support to mikecore (upstream fix) # Instead of manual open/close pairs:
with DfsuFile.Open(self._filename) as dfs:
itemdata = dfs.ReadItemTimeStep(item + 1, step)This is the highest-leverage change since it prevents the entire class of bugs. A PR to mikecore would be straightforward (3-4 lines per class). 2. Add a @contextmanager
def _open_dfs(open_func, *args):
dfs = open_func(*args)
try:
yield dfs
finally:
dfs.Close()This could be adopted incrementally — 3. Add a lint rule or test to catch regressions |
|
@copilot Implement alternative two in a separate PR that is branches off this one. Your implementation should make it easy in the future to switch to alternative one when it's been added upstream. |
I'm unable to create separate branches or open new PRs from this environment. I can implement the context manager wrapper in this PR if you'd like, or you can open a new branch off The implementation plan would be:
|
|
@copilot Put your newly created tests into appropriate existing test files. |
Moved tests from test_file_handle_close.py into their corresponding existing test files: - test_count_open_fds_sanity_check and test_dfs1_init_closes_file_handle → test_dfs1.py - test_dfs3_init_closes_file_handle → test_dfs3.py - test_dfsu_read_closes_file_handle → test_dfsu2dh.py Deleted the standalone test_file_handle_close.py. Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Done in c33343e. Moved the tests into their corresponding existing test files:
Deleted |
|
@copilot You previously double checked your work to make sure all open calls have matching close calls. The purpose of this was to not leak file handles.
Please check once more, with particular focus on dfs2 files. |
Added test_dfs2_init_closes_file_handle and test_dfs2_read_closes_file_handle to verify Dfs2.__init__ and Dfs2.read() don't leak file handles. Both pass — Dfs2 already used the correct pattern (local variable + Close()). Co-authored-by: ryan-kipawa <74630349+ryan-kipawa@users.noreply.github.com>
Rechecked all Dfs2 is already correct — I also ran a comprehensive FD-counting test across all file types (Dfs0, Dfs1, Dfs2, Dfs3, Dfsu2DH) for both Added |
File handles opened via mikecore were not being closed in several code paths, causing resource exhaustion on Windows for long-running processes or those reading 500+ files.
Leaked handles fixed
_dfs1.py__init__—Dfs1FileOpenopened to read spatial axis info, stored inself._dfs, never closed_dfs3.py_read_dfs3_header—Dfs3FileOpenopened to read spatial axis info, never closed (also removed deadself._sourceassignment)_dfsu.py_dfs_read_item_time_func—DfsuFile.Opencalled per-iteration insideextract_track()loop, never closed — most critical leakgeneric.pyavg_time— inputdfs_ihandle never closedgeneric.pyquantile— inputdfs_ihandle never closedPattern
Each fix follows the same approach: close the handle immediately after the data needed from it has been read, matching the convention used elsewhere in the codebase.
Verified no other leaks
All
DfsFileFactory.*OpenandDfsuFile.Opencalls across the codebase were audited, including Dfs2 which already used the correct pattern (local variable +Close()). No additional leaks were found — all other open calls have matching.Close()calls either directly or viawrite_dfs_data/write_dfsu_data.Tests
Added file handle leak tests to existing test files, covering all DFS types:
test_dfs1.py:test_count_open_fds_sanity_check— verifies that/proc/self/fdcounting actually detects mikecore file handles (opening increases count, closing restores it);test_dfs1_init_closes_file_handle— creates 50Dfs1instances while holding references; with the old code each instance would leak a file descriptor, with the fix zero are leakedtest_dfs2.py:test_dfs2_init_closes_file_handle— verifiesDfs2.__init__does not leak;test_dfs2_read_closes_file_handle— verifiesDfs2.read()does not leaktest_dfs3.py:test_dfs3_init_closes_file_handle— same pattern forDfs3test_dfsu2dh.py:test_dfsu_read_closes_file_handle— creates and reads 50Dfsufiles while holding references to the resultsTests hold references to created instances to prevent Python's garbage collector from masking leaks via
__del__, ensuring the leak is detected deterministically.Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.