HDF5

Interfaces for HDF5 Datasets

Note

HDF5 arrays are accessed through a proxy class, H5Proxy. Getting and setting values should work as normal, except that setting values on nested views is not possible.

Specifically this doesn’t work:

my_model.array[0][0] = 1

But this does work:

my_model.array[0,0] = 1

For direct access to the underlying HDF5 dataset, use the H5Proxy.open() method.
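The difference between the two forms above can be illustrated with a toy proxy. This is only a sketch of the general mechanism, not H5Proxy itself: `ToyProxy` and the JSON file it reads are hypothetical stand-ins, and the sketch assumes that each item access re-reads the data from disk and returns an in-memory copy.

```python
import json
import tempfile
from pathlib import Path


class ToyProxy:
    """Hypothetical stand-in for a file-backed array proxy (not H5Proxy)."""

    def __init__(self, file):
        self.file = Path(file)

    def _load(self):
        # The file is only read for the duration of a single method call.
        return json.loads(self.file.read_text())

    def __getitem__(self, item):
        data = self._load()  # a fresh in-memory copy on every access
        if isinstance(item, tuple):
            for index in item:
                data = data[index]
            return data
        return data[item]

    def __setitem__(self, key, value):
        data = self._load()
        if isinstance(key, tuple):
            *parents, last = key
            target = data
            for index in parents:
                target = target[index]
            target[last] = value
        else:
            data[key] = value
        self.file.write_text(json.dumps(data))  # write the change back to disk


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "array.json"
    path.write_text(json.dumps([[0, 0], [0, 0]]))
    proxy = ToyProxy(path)

    # proxy[0] returns a copy of row 0; the assignment changes only that copy.
    proxy[0][0] = 1
    after_nested = proxy[0, 0]

    # A single __setitem__ call on the proxy itself, so the change is saved.
    proxy[0, 0] = 1
    after_tuple = proxy[0, 0]
```

In this sketch the nested form never reaches the proxy's `__setitem__`, which is why the value on disk is unchanged afterwards.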

class H5ArrayPath(file: Path | str, path: str, field: str | List[str] | None = None)

Location specifier for arrays within an HDF5 file

Create new instance of H5ArrayPath(file, path, field)

file: Path | str

Location of HDF5 file

path: str

Path within the HDF5 file

field: str | List[str] | None

Refer to a specific field within a compound dtype
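As a rough illustration of the three fields, the sketch below defines a local named tuple with the same signature. It assumes that H5ArrayPath behaves like a `typing.NamedTuple`, which the generated constructor description suggests; `ArrayLocation`, the file name, and the dataset paths are hypothetical.

```python
from pathlib import Path
from typing import List, NamedTuple, Optional, Union


class ArrayLocation(NamedTuple):
    """Hypothetical mirror of the documented H5ArrayPath fields."""

    file: Union[Path, str]
    path: str
    field: Optional[Union[str, List[str]]] = None


# Point at a whole dataset; field defaults to None.
whole = ArrayLocation("data.h5", "/data/array")

# Point at one field of a dataset with a compound dtype.
one_field = ArrayLocation(Path("data.h5"), "/data/table", field="timestamp")

# Named tuples unpack positionally, in (file, path, field) order.
file, path, field = whole
```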

class H5Proxy(file: Path | str, path: str, field: str | List[str] | None = None)

Proxy class to mimic numpy-like array behavior with an HDF5 array

The attribute and item access methods only open the file for the duration of the method, making it less perilous to share this object between threads and processes.

This class attempts to be a passthrough class to a h5py.Dataset object, including its attributes and item getters/setters.

When using read-only methods, no locking is attempted (beyond the HDF5 defaults). When using the write methods (setting an array value), the class tries to use the locking methods of h5py.File.

Parameters:
  • file (pathlib.Path | str) – Location of the HDF5 file on the filesystem

  • path (str) – Path to the array within the HDF5 file

  • field (str | list[str]) – Optional. Refers to a specific field within a compound dtype

array_exists() → bool

Check that there is in fact an array at path within file

classmethod from_h5array(h5array: H5ArrayPath) → H5Proxy

Instantiate using H5ArrayPath

property dtype: dtype

Get dtype of array, using field if present

__len__() → int

Return self.shape[0], the length of the first dimension

open(mode: str = 'r') → Dataset

Return the opened h5py.Dataset object

You must remember to close the associated file with close()

close() → None

Close the h5py.File object left open when returning the dataset with open()
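Because open() leaves the file open until close() is called, it can help to pair the two calls so that close() runs even if an error occurs in between. The helper below is a minimal sketch of that pattern, relying only on the open(mode) and close() signatures documented above; `opened` and `FakeProxy` are hypothetical names, and `FakeProxy` merely records calls so the example runs without an HDF5 file.

```python
from contextlib import contextmanager


@contextmanager
def opened(proxy, mode="r"):
    """Yield the result of proxy.open(mode) and always call proxy.close()."""
    dataset = proxy.open(mode)
    try:
        yield dataset
    finally:
        proxy.close()


class FakeProxy:
    """Hypothetical stand-in that only records calls; not the real H5Proxy."""

    def __init__(self):
        self.calls = []

    def open(self, mode="r"):
        self.calls.append(f"open:{mode}")
        return [0, 0, 0]  # placeholder for the dataset object

    def close(self):
        self.calls.append("close")


proxy = FakeProxy()
try:
    with opened(proxy, mode="r+") as dataset:
        dataset[0] = 1
        raise RuntimeError("simulated failure while the file is open")
except RuntimeError:
    pass

# close() was still called, despite the exception inside the with block.
calls = proxy.calls
```

With a real proxy, the same helper would presumably be used as `with opened(my_model.array, "r+") as dataset: ...`, but that usage is untested here.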

class H5Interface(shape: Tuple[int, ...] | Any, dtype: str | type | Any | generic)

Interface for Arrays stored as datasets within an HDF5 file.

Takes an H5ArrayPath specifier to select an h5py.Dataset from an h5py.File and returns an H5Proxy instance that acts as a passthrough, numpy-like interface to the dataset.

return_type

alias of H5Proxy

classmethod enabled() → bool

Check whether h5py can be imported

classmethod check(array: H5ArrayPath | Tuple[Path | str, str]) → bool

Check that the given array is an H5ArrayPath or something that resembles one.

before_validation(array: Any) → NDArrayType

Create an H5Proxy to use throughout validation

get_dtype(array: NDArrayType) → str | type | Any | generic

Get the dtype from the input array

Subclasses to correctly handle

classmethod to_json(array: H5Proxy, info: SerializationInfo | None = None) → dict

Dump to a dictionary containing:

  • file: file

  • path: path

  • attrs: Any HDF5 attributes on the dataset

  • array: The array as a list of lists
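For orientation, a dictionary with the four keys listed above might look like the following. This is a sketch of the shape only: every value is hypothetical, and the exact types the library emits (for example, how the file location is represented) are not confirmed here.

```python
import json

# Hypothetical values; only the four keys come from the description above.
dumped = {
    "file": "data.h5",
    "path": "/data/array",
    "attrs": {"units": "mV"},
    "array": [[1, 2, 3], [4, 5, 6]],
}

# A dictionary of this shape round-trips through JSON unchanged.
text = json.dumps(dumped)
restored = json.loads(text)

rows = len(restored["array"])
columns = len(restored["array"][0])
```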