aiaccel.torch.h5py package#

Submodules#

aiaccel.torch.h5py.hdf5_writer module#

class aiaccel.torch.h5py.hdf5_writer.HDF5Writer[source]#

Bases: Generic[T1, T2]

Abstract base class for writing data to an HDF5 file.

This class provides methods to write data into HDF5 format, supporting both single-process and parallel (MPI-based) writing. Subclasses must implement prepare_globals and prepare_group to define how data is structured.

Typical usage looks like this:

from pathlib import Path

import numpy as np

from aiaccel.torch.h5py.hdf5_writer import HDF5Writer


class FooHDF5Writer(HDF5Writer):
    def prepare_globals(self):
        # 100 items to write, plus shared context used by every prepare_group call.
        item_list = list(range(100))

        offset = 10
        maximum = 50

        return item_list, (offset, maximum)

    def prepare_group(self, item, context):
        offset, maximum = context

        group_name = f"{item:04d}"

        # One group per item, each holding a single "data" dataset capped at `maximum`.
        return {group_name: {"data": np.full([10, 10], offset + item).clip(max=maximum)}}


writer = FooHDF5Writer()
writer.write(Path("test.hdf5"), parallel=False)
h5: File#

abstractmethod prepare_globals() → tuple[list[T1], T2][source]#

Prepare the global data required for writing.

This method must be implemented by subclasses to provide the data items and any necessary context for processing.

Returns:

A tuple containing a list of data items and context information.

Return type:

tuple[list[T1], T2]
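As a further illustration of this contract (the directory, glob pattern, and statistics dict below are hypothetical), a subclass that writes one group per input file could implement it as:

from pathlib import Path

# Inside a hypothetical HDF5Writer subclass:
def prepare_globals(self) -> tuple[list[Path], dict[str, float]]:
    # Items: one entry per input file; context: statistics shared by all prepare_group calls.
    item_list = sorted(Path("raw_data").glob("*.npy"))
    context = {"mean": 0.0, "std": 1.0}
    return item_list, context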

abstractmethod prepare_group(item: T1, context: T2) → dict[str, dict[str, ndarray[tuple[int, ...], dtype[Any]]]][source]#

Prepare groups of datasets for writing to HDF5.

This method must be implemented by subclasses to define how individual data items should be structured within the HDF5 file.

Parameters:
  • item (T1) – A single data item.

  • context (T2) – Additional context for processing.

Returns:

A dictionary mapping group names to dataset dictionaries.

Return type:

dict[str, dict[str, npt.NDArray[Any]]]
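Continuing the hypothetical subclass sketched under prepare_globals above, a matching implementation could return one group per input file, with dataset names of its own choosing:

from pathlib import Path

import numpy as np

# Inside the same hypothetical HDF5Writer subclass:
def prepare_group(self, item: Path, context: dict[str, float]) -> dict[str, dict[str, np.ndarray]]:
    arr = np.load(item)
    normalized = (arr - context["mean"]) / context["std"]

    # One group named after the file, holding two datasets: "data" and "raw".
    return {item.stem: {"data": normalized, "raw": arr}}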

write(filename: Path, parallel: bool = False) → None[source]#

Write data to an HDF5 file, optionally using parallel processing.

Parameters:
  • filename (Path) – Path to the output HDF5 file.

  • parallel (bool, optional) – Whether to use parallel writing. Defaults to False.
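
A hedged usage sketch: with parallel=False the current process writes the whole file; with parallel=True the class description above says writing is MPI-based, so the assumption here is that the script is launched under an MPI runner (e.g. mpiexec) with an MPI-enabled h5py build. The output paths are arbitrary.

from pathlib import Path

writer = FooHDF5Writer()

# Single-process write.
writer.write(Path("dataset.hdf5"), parallel=False)

# MPI-based parallel write (assumed launch: mpiexec -n 4 python make_dataset.py).
writer.write(Path("dataset_parallel.hdf5"), parallel=True)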

Module contents#

class aiaccel.torch.h5py.HDF5Writer[source]#

Bases: Generic[T1, T2]

Abstract base class for writing data to an HDF5 file, re-exported at the package level. See aiaccel.torch.h5py.hdf5_writer.HDF5Writer above for the full API documentation and usage example.