Interfaces

Interfaces are the bridge between the abstract NDArray specification and concrete array libraries. They are subclasses of the abstract Interface class.

They contain methods for coercion, validation, serialization, and any other implementation-specific functionality.

Discovery

Interfaces are discovered through the Interface.interfaces() method, which returns all subclasses of Interface. To use a custom interface, it only needs to be defined or imported before you instantiate the pydantic model that uses it.

Each interface implements an Interface.enabled() method that determines whether that interface can be used. Typically that means checking whether its dependencies are present in the environment, but it can also be used to enable an interface conditionally.
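As a rough illustration of subclass-based discovery, here is a minimal sketch. ToyInterface, usable(), and both subclasses are hypothetical names invented for this example; the sketch only assumes that discovery walks the subclass tree and that enabled() is consulted before an interface is used, and it is not the library's actual implementation.

```python
from importlib.util import find_spec


class ToyInterface:
    """Hypothetical stand-in for the abstract Interface class."""

    @classmethod
    def enabled(cls) -> bool:
        # Subclasses override this to report whether they can be used
        return False

    @classmethod
    def interfaces(cls) -> list:
        # Collect every subclass, recursing so indirect subclasses are found too
        found = []
        for sub in cls.__subclasses__():
            found.append(sub)
            found.extend(sub.interfaces())
        return found

    @classmethod
    def usable(cls) -> list:
        # Hypothetical helper: keep only the interfaces that report enabled
        return [iface for iface in cls.interfaces() if iface.enabled()]


class ListInterface(ToyInterface):
    @classmethod
    def enabled(cls) -> bool:
        # No third-party dependencies, so always available
        return True


class MissingDepInterface(ToyInterface):
    @classmethod
    def enabled(cls) -> bool:
        # Enabled only if a (deliberately nonexistent) dependency is importable
        return find_spec("surely_not_an_installed_package_xyz") is not None
```

Because discovery relies on the subclass simply existing, defining or importing ListInterface is enough for it to appear in ToyInterface.interfaces(); no explicit registration step is needed.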

Matching

When a pydantic model is instantiated and an NDArray is to be validated, Interface.match() first finds the matching interface.

Each interface must define an Interface.check() method that accepts the array to be validated and returns whether the interface can handle it. Interfaces can use any checking logic they want, e.g. determining whether a path is a particular type of file, but they should return quickly and do little work since they are called frequently.

Validation fails if an argument doesn’t match any interface.

Note

The NumpyInterface is special-cased and is only checked if no other interface matches. It attempts to cast the input argument to a numpy.ndarray to see if it is array-like. Since many lazy-loaded array libraries will attempt to load the whole array into memory when cast to an ndarray, we only try this as a last resort.
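The matching logic described above can be sketched roughly as follows. SketchInterface, TextFileInterface, FallbackInterface, and the free function match() are all hypothetical names for this example, and the fallback check is a cheap stand-in for the cast to numpy.ndarray; how the library handles multiple matches is not covered here.

```python
from pathlib import Path


class SketchInterface:
    """Hypothetical base class; not the library's actual Interface."""

    @classmethod
    def check(cls, array) -> bool:
        raise NotImplementedError


class TextFileInterface(SketchInterface):
    @classmethod
    def check(cls, array) -> bool:
        # Cheap test: look only at the type and suffix, never open the file
        return isinstance(array, (str, Path)) and Path(array).suffix == ".txt"


class FallbackInterface(SketchInterface):
    """Plays the role of the special-cased, last-resort interface."""

    @classmethod
    def check(cls, array) -> bool:
        # Stand-in for "can this be cast to an array?": accept lists and tuples
        return isinstance(array, (list, tuple))


def match(array, candidates, fallback):
    # Try every ordinary interface first and return the first that accepts
    for iface in candidates:
        if iface.check(array):
            return iface
    # Only consult the fallback when nothing else matched
    if fallback.check(array):
        return fallback
    # Nothing matched at all, so validation fails
    raise ValueError(f"No interface matches input of type {type(array).__name__}")
```

For example, match("data.txt", [TextFileInterface], FallbackInterface) selects TextFileInterface without touching the fallback, while a plain list falls through to FallbackInterface.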

Validation

Validation is a chain of lifecycle methods: each one receives a single argument and returns a single value, which is passed on to the next:

Interface.validate() calls, in order: before_validation(), validate_dtype(), validate_shape(), after_validation().

The before and after methods provide hooks for coercion, loading, etc. such that validate can accept one of the types in the interface’s input_types and return the return_type.
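To make the lifecycle concrete, here is a minimal sketch of such a chain for nested Python sequences. ChainInterface and its constructor arguments are hypothetical; the four stage names follow the diagram on this page, but the bodies are illustrative only and do not reflect how any real interface validates arrays.

```python
class ChainInterface:
    """Hypothetical interface that validates 2D nested sequences."""

    input_types = (list, tuple)  # what validate() accepts
    return_type = tuple          # what validate() hands back

    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = dtype

    def before_validation(self, array):
        # Coercion hook: normalize any accepted input into a list of lists
        return [list(row) for row in array]

    def validate_dtype(self, array):
        for row in array:
            for item in row:
                if not isinstance(item, self.dtype):
                    raise TypeError(f"{item!r} is not of type {self.dtype.__name__}")
        return array

    def validate_shape(self, array):
        n_cols = len(array[0]) if array else 0
        ragged = any(len(row) != n_cols for row in array)
        if ragged or (len(array), n_cols) != self.shape:
            raise ValueError(f"expected shape {self.shape}")
        return array

    def after_validation(self, array):
        # Final hook: convert to the declared return_type
        return tuple(tuple(row) for row in array)

    def validate(self, array):
        # Each stage takes one argument and returns the value for the next stage
        array = self.before_validation(array)
        array = self.validate_dtype(array)
        array = self.validate_shape(array)
        return self.after_validation(array)
```

With this sketch, ChainInterface((2, 2), int).validate([[1, 2], [3, 4]]) returns a tuple of tuples, while a wrong element type raises TypeError and a wrong shape raises ValueError.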

Diagram

Todo

The rendered diagram is currently hard to read; the theme for generated mermaid diagrams still needs to be changed.

flowchart LR
    classDef data fill:#2b8cee,color:#ffffff;
    classDef X fill:transparent,border:none,color:#ff0000;

    input
    subgraph Interface
        match
    end
    subgraph Numpy
        numpy_check["check"]
    end
    subgraph Dask
        direction TB
        dask_check["check"]
        subgraph Validation
            direction TB
            before_validation --> validate_dtype
            validate_dtype --> validate_shape
            validate_shape --> after_validation
        end
        dask_check --> Validation
    end
    subgraph Zarr
        zarr_check["check"]
    end
    subgraph Model
        output
    end

    zarr_x["X"]
    numpy_x["X"]

    input --> match
    match --> numpy_check
    match --> zarr_check
    match --> Dask
    zarr_check --> zarr_x
    numpy_check --> numpy_x
    Validation --> Model

    class input data
    class output data
    class zarr_x X
    class numpy_x X