HDF5
Interfaces for HDF5 Datasets
Note
HDF5 arrays are accessed through a proxy class, H5Proxy.
Getting and setting values should work as normal, except that values cannot be set on nested views.
Specifically, this doesn’t work:
my_model.array[0][0] = 1
But this does work:
my_model.array[0,0] = 1
To access the HDF5 dataset directly, use the H5Proxy.open() method.
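One plausible way to picture the restriction above is a proxy whose item getter returns a detached copy of the data, so that a chained assignment only modifies the copy. The sketch below uses a hypothetical ToyProxy class backed by nested lists; it is not numpydantic's H5Proxy and only mimics the behaviour described in the note.

```python
import copy


class ToyProxy:
    """Hypothetical stand-in for a file-backed array proxy (not H5Proxy)."""

    def __init__(self, rows):
        # Pretend this nested list lives in a file on disk
        self._store = rows

    def __getitem__(self, key):
        if isinstance(key, tuple):
            row, col = key
            return self._store[row][col]
        # Reading a whole row returns a detached copy, as a read from disk would
        return copy.deepcopy(self._store[key])

    def __setitem__(self, key, value):
        # Only tuple indices reach the backing store
        row, col = key
        self._store[row][col] = value


proxy = ToyProxy([[0, 0], [0, 0]])

proxy[0][0] = 1              # writes into a temporary copy, then discards it
nested_result = proxy[0, 0]  # the backing store is unchanged

proxy[0, 0] = 1              # goes through __setitem__ on the proxy itself
tuple_result = proxy[0, 0]   # the backing store now holds the new value
```

In this toy, the chained form never calls the proxy's own setter, which is why only the tuple-indexed form persists.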
Datetimes
Datetimes are supported as a dtype annotation, but currently they must be stored
as S32 ISO-formatted byte strings (timezones optional), like:
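import h5py
import numpy as np
from datetime import datetime

# Encode the ISO-formatted timestamp as a fixed-width 32-byte string
data = np.array([datetime.now().isoformat().encode('utf-8')], dtype="S32")
# The context manager closes the file once the dataset has been written
with h5py.File('test.hdf5', 'w') as h5f:
    h5f.create_dataset('data', data=data)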
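As a rough check of the S32 convention, the sketch below encodes a datetime to an ISO-formatted byte string, verifies that it fits in 32 bytes, and decodes it again using only the standard library. The helper names to_s32 and from_s32 are hypothetical and are not part of numpydantic or h5py.

```python
from datetime import datetime, timezone


def to_s32(dt):
    """Encode a datetime as ISO-formatted bytes, refusing values over 32 bytes."""
    raw = dt.isoformat().encode("utf-8")
    if len(raw) > 32:
        raise ValueError(f"{len(raw)} bytes does not fit in an S32 field")
    return raw


def from_s32(raw):
    """Decode ISO-formatted bytes, dropping any null padding from fixed-width storage."""
    return datetime.fromisoformat(raw.rstrip(b"\x00").decode("utf-8"))


# A UTC timestamp with microseconds uses the full 32 bytes
aware = datetime(2024, 1, 2, 3, 4, 5, 123456, tzinfo=timezone.utc)
encoded = to_s32(aware)
decoded = from_s32(encoded)

# A naive timestamp without microseconds is shorter and also round-trips
naive = datetime(2024, 1, 2, 3, 4, 5)
naive_decoded = from_s32(to_s32(naive))
```

The length check is there because a fixed-width string dtype may silently truncate longer values, which would leave a timestamp that can no longer be parsed.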
- class H5ArrayPath(file: Path | str, path: str, field: str | List[str] | None = None)
Location specifier for arrays within an HDF5 file
Create new instance of H5ArrayPath(file, path, field)
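The "Create new instance of" wording suggests a named-tuple-style record with an optional field entry. Assuming that is the case, the sketch below uses a hypothetical ArrayLocation class with the same three fields to show how such a specifier can be built and unpacked; it is a stand-in, not the library's class.

```python
from pathlib import Path
from typing import List, NamedTuple, Optional, Union


class ArrayLocation(NamedTuple):
    """Hypothetical stand-in mirroring the H5ArrayPath signature."""

    file: Union[Path, str]                        # HDF5 file on disk
    path: str                                     # dataset path inside the file
    field: Optional[Union[str, List[str]]] = None  # optional field name(s)


# Positional and keyword construction both work; field defaults to None
loc = ArrayLocation("test.hdf5", "/data")
loc_with_field = ArrayLocation(file="test.hdf5", path="/data", field="timestamp")

# Named tuples unpack like ordinary tuples
file, path, field = loc
```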
- pydantic model H5JsonDict
Round-trip JSON-able version of an HDF5 dataset
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Fields:
- to_array_input() → H5ArrayPath
Construct an H5ArrayPath
- class H5Proxy(file: Path | str, path: str, field: str | List[str] | None = None, annotation_dtype: str | type | Any | generic | None = None)
Proxy class to mimic numpy-like array behavior with an HDF5 array
The attribute and item access methods only open the file for the duration of the method, making it less perilous to share this object between threads and processes.
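The open-per-access pattern described above can be sketched without HDF5 at all. The example below uses a hypothetical PerAccessProxy class backed by a JSON file so that it runs with the standard library alone; it illustrates the idea only and is not how H5Proxy is implemented.

```python
import json
import os
import tempfile


class PerAccessProxy:
    """Hypothetical proxy that reopens its backing file on every access."""

    def __init__(self, file):
        # Only the file name is stored, never an open handle
        self.file = file

    def __getitem__(self, index):
        # The file is open only for the duration of this method
        with open(self.file) as f:
            return json.load(f)[index]

    def __setitem__(self, index, value):
        # Read, modify, and write back; no handle outlives the call
        with open(self.file) as f:
            data = json.load(f)
        data[index] = value
        with open(self.file, "w") as f:
            json.dump(data, f)


# Create a small backing file in a temporary directory
backing = os.path.join(tempfile.mkdtemp(), "toy.json")
with open(backing, "w") as f:
    json.dump([1, 2, 3], f)

proxy = PerAccessProxy(backing)
first = proxy[0]
proxy[1] = 20
updated = proxy[1]
```

Because the object holds no open handle, it can be copied or pickled cheaply. The read-modify-write in this toy setter is not locked, so concurrent writers would still need some form of file locking.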
This class attempts to be a passthrough class to a h5py.Dataset object, including its attributes and item getters/setters.
When using read-only methods, no locking is attempted (beyond the HDF5 defaults), but when using the write methods (setting an array value), try to use the locking methods of h5py.File.
- Parameters:
- classmethod from_h5array(h5array: H5ArrayPath) → H5Proxy
Instantiate using H5ArrayPath
- open(mode: str = 'r') → Dataset
Return the opened h5py.Dataset object.
You must remember to close the associated file with close()
- class H5Interface(shape: Tuple[int, ...] | Any = typing.Any, dtype: str | type | Any | generic = typing.Any)
Interface for Arrays stored as datasets within an HDF5 file.
Takes a H5ArrayPath specifier to select a h5py.Dataset from a h5py.File and returns a H5Proxy class that acts like a passthrough numpy-like interface to the dataset.
- json_model
alias of H5JsonDict
- classmethod check(array: H5ArrayPath | Tuple[Path | str, str]) → bool
Check that the given array is a H5ArrayPath or something that resembles one.
- before_validation(array: Any) → NDArrayType
Create an H5Proxy to use throughout validation
- get_dtype(array: NDArrayType) → str | type | Any | generic
Get the dtype from the input array
Subclasses to correctly handle