Skip to content

HDF5 parameters for chunking and cache size#1240

Open
MBartkowiakSTFC wants to merge 13 commits into
protosfrom
maciej/hdf5-drivers
Open

HDF5 parameters for chunking and cache size#1240
MBartkowiakSTFC wants to merge 13 commits into
protosfrom
maciej/hdf5-drivers

Conversation

@MBartkowiakSTFC

Copy link
Copy Markdown
Collaborator

Description of work
As mentioned in #1162, on certain platforms (especially using cloud storage) it may be necessary to use different chunking scheme, HDF5 drivers or HDF5 cache size. This PR is meant to let the users control the HDF5 parameters so we can try out different parameter combinations easily.

Fixes

  1. Added "frames per chunk" to the "atoms per chunk" in the trajectory output.
  2. Added HDF5 cache parameters to the trajectory input widget.
  3. Suppressed ChemicalSystem creation in subprocesses.
  4. Removed ChemicalSystem from Configuration classes.

To test
All tests must pass.
Manual testing of CLI and GUI is recommended.

@MBartkowiakSTFC MBartkowiakSTFC requested a review from oerc0122 June 9, 2026 12:37

@oerc0122 oerc0122 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few stylistic things and some failing tests.

Comment thread MDANSE/Src/MDANSE/Framework/Configurators/HDFTrajectoryConfigurator.py Outdated
Comment thread MDANSE/Src/MDANSE/Framework/Configurators/HDFTrajectoryConfigurator.py Outdated
Comment thread MDANSE/Src/MDANSE/Framework/Jobs/ScatteringLengthDensityProfile.py Outdated
Comment thread MDANSE/Src/MDANSE/Framework/Jobs/TrajectoryEditor.py Outdated
f"{self.chemical_system.name} which has "
f"{self.chemical_system.total_number_of_atoms} atoms."
)
def clone(self) -> _PeriodicConfiguration:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These probably want to be typing_extensions.Self

list[list[int]]
A list of atom index lists, one per molecule instance.
"""
return reduce(list.__add__, chemical_system.clusters.values(), [])

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more_itertools.flatten as noted above.

Comment on lines +207 to +209
grouped_indices = reduce(
list.__add__, self._chemical_system.clusters.values(), []
)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above

Comment on lines +116 to +118
grouped_indices = reduce(
list.__add__, self._chemical_system.clusters.values(), []
)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again

Comment thread MDANSE_GUI/Src/MDANSE_GUI/InputWidgets/OutputTrajectoryWidget.py Outdated

for widget in self._layout.children():
if issubclass(widget, QWidget):
if isinstance(widget, QWidget):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly concerning we never caught this.

@MBartkowiakSTFC

Copy link
Copy Markdown
Collaborator Author

I am going to make minor changes to add some control over meta_block_size. The larger value made the file size increase by 10%, but also gave a big speedup for accessing the file metadata on cloud-based filesystems. We probably need an extra input for this value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants