Typechecker Integration

Numpydantic does things with the Python type system that are not formally supported. By default it tries to behave permissively for type checkers, but it also provides plugins for strict static type checking.

Mypy Plugin

Added in version 1.9.0.

Experimental!

The mypy plugin is experimental! Please raise issues describing any bugs or shortcomings! Input is very welcome on what would be useful here.

Numpydantic provides a mypy plugin that is capable of:

  • Checking NDArray-NDArray annotation compatibility with shapes and dtypes

  • Checking numpy.ndarray-NDArray annotations: blank np.ndarray annotations are treated as NDArray[Any, Any]

  • Inferring array shapes and dtypes from constructors like numpy.ones() for any supported interfaces when arrays are constructed with integer literals.

Type checking works across all typing comparisons: directly annotated assignments, function/method parameters and return types, and of course pydantic fields! (see below)

With explicit annotations, this makes it possible to escape some of the ambiguity of passing np.ndarray types around!

E.g. say you have some analysis function that only accepts grayscale images:

import numpy as np

from numpydantic import NDArray, Shape

GRAYSCALE = NDArray[Shape["* x, * y"], np.uint8]
RGB = NDArray[Shape["* x, * y, 3 rgb"], np.uint8]


def read_rgb() -> RGB:
    return np.ones((1920, 1080, 3), dtype=np.uint8)


def read_grayscale() -> GRAYSCALE:
    return np.ones((1920, 1080), dtype=np.uint8)


def grayscale_mask(frame: GRAYSCALE) -> GRAYSCALE:
    # Probably something fancier than this...
    mask = np.zeros((frame.shape[0], frame.shape[1]), np.uint8)
    mask[frame > 5] = 1
    return mask


# this works
grayscale_mask(read_grayscale())

# this doesn't
grayscale_mask(read_rgb())

Find bugs with your type checker before they’re deployed!

examples/incorrect/rgb_gray_frame.py:28: error: Argument 1 to "grayscale_mask"
has incompatible type
"ndarray[tuple[int, int, Literal[3]], dtype[unsignedinteger[_8Bit]]]"; expected
"ndarray[tuple[int, int], dtype[unsignedinteger[_8Bit]]]"  [arg-type]
    grayscale_mask(read_rgb())
                   ^~~~~~~~~~
Found 1 error in 1 file (checked 1 source file)

Configuration

Enable the mypy plugin in your pyproject.toml configuration:

[tool.mypy]
plugins = [
  "numpydantic.mypy",
]

And configure it with the tool.numpydantic.mypy table:

[tool.numpydantic.mypy]
interfaces = [
  "numpy",
  "dask",
  "zarr",
]

If you are using numpydantic with pydantic, you should also enable pydantic’s mypy plugin. By default, pydantic uses Any types for all the fields in its synthesized __init__ method, so static checking for array types doesn’t work. Numpydantic’s plugin handles the input type transformations correctly (see below), so you likely want to set the pydantic plugin’s init_typed option to True.

If you are using numpydantic with zarr or any other array interface whose package is untyped, you will need to enable mypy’s follow-untyped-imports option.

A full configuration might then look like this:

[tool.mypy]
plugins = [
  "numpydantic.mypy",
  "pydantic.mypy",
]

[[tool.mypy.overrides]]
module = ["zarr.*"]
follow_untyped_imports = true

[tool.numpydantic.mypy]
interfaces = [
  "numpy",
  "dask",
  "zarr",
]

[tool.pydantic-mypy]
init_typed = true

Configuration Reference

class MypyPluginOptions(interfaces: list[str] = <factory>)

Configure the mypy plugin.

Set options in the [tool.numpydantic.mypy] table

interfaces: list[str]

A list of interface names that should have their constructor return types enriched (and replaced with np.ndarray types).

Numpy constructors are always enriched (to disable them, disable the plugin).

Shape Checking

Scalars

Scalar shapes are checked as expected, where the shapes must match exactly.

Correct

def make_array() -> NDArray[Shape[1, 2, 3], np.uint8]:
    return np.ones((1, 2, 3), dtype=np.uint8)


x: NDArray[Shape[1, 2, 3], np.uint8] = make_array()

Success: no issues found in 1 source file

Incorrect

 6  def make_array() -> NDArray[Shape[2, 3, 4], np.uint8]:
 7      return np.ones((1, 2, 3), dtype=np.uint8)
 8
 9
10  x: NDArray[Shape[5, 6, 7], np.uint8] = make_array()

examples/incorrect/shape_scalar.py:7: error: Incompatible return value type
(got "ndarray[int, dtype[unsignedinteger[_8Bit]]]", expected
"ndarray[tuple[Literal[2], Literal[3], Literal[4]], dtype[unsignedinteger[_8Bit]]]")
 [return-value]
        return np.ones((1, 2, 3), dtype=np.uint8)
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
examples/incorrect/shape_scalar.py:10: error: Incompatible types in assignment
(expression has type
"ndarray[tuple[Literal[2], Literal[3], Literal[4]], dtype[unsignedinteger[_8Bit]]]",
variable has type
"ndarray[tuple[Literal[5], Literal[6], Literal[7]], dtype[unsignedinteger[_8Bit]]]")
 [assignment]
    x: NDArray[Shape[5, 6, 7], np.uint8] = make_array()
                                           ^~~~~~~~~~~~
Found 2 errors in 1 file (checked 1 source file)

Ranges

Ranges work on the left-hand side when the right-hand side is a scalar (see Known Limitations). When possible, you should attempt to keep right-hand/return annotations scalar for the strongest typing, even if left-hand annotations can accept shape ranges.

Correct

def make_array() -> NDArray[Shape[1, 2, 3], np.uint8]:
    return np.ones((1, 2, 3), dtype=np.uint8)


x: NDArray[Shape["1-3, 1-4, 2-5"], np.uint8] = make_array()

Success: no issues found in 1 source file

Incorrect

 6  def make_array() -> NDArray[Shape[1, 2, 3], np.uint8]:
 7      return np.ones((1, 2, 3), dtype=np.uint8)
 8
 9
10  x: NDArray[Shape["1-3, 1-4, 10-20"], np.uint8] = make_array()

examples/incorrect/shape_range_scalar.py:10: error: Incompatible types in
assignment (expression has type
"ndarray[tuple[Literal[1], Literal[2], Literal[3]], dtype[unsignedinteger[_8Bit]]]",
variable has type
"ndarray[tuple[Literal['1-3'], Literal['1-4'], Literal['10-20']], dtype[unsignedinteger[_8Bit]]]")
 [assignment]
    x: NDArray[Shape["1-3, 1-4, 10-20"], np.uint8] = make_array()
                                                     ^~~~~~~~~~~~
Found 1 error in 1 file (checked 1 source file)
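
To make the scalar-against-range rule concrete, here is a small plain-Python sketch of the comparison the two examples above exercise. It is not numpydantic's implementation: `parse_dimension` and `shape_fits` are hypothetical helpers, and treating both ends of a range as inclusive is an assumption made for illustration.

```python
def parse_dimension(spec: str) -> tuple[int, int]:
    """Turn "3" into (3, 3) and "1-4" into (1, 4)."""
    low, sep, high = spec.strip().partition("-")
    # Assumption: a bare number is treated as a range of width zero.
    return (int(low), int(high)) if sep else (int(low), int(low))


def shape_fits(shape: tuple[int, ...], spec: str) -> bool:
    """Check a concrete (scalar) shape against a comma-separated range spec."""
    bounds = [parse_dimension(part) for part in spec.split(",")]
    if len(bounds) != len(shape):
        return False
    # Assumption: range bounds are inclusive on both ends.
    return all(low <= size <= high for size, (low, high) in zip(shape, bounds))


print(shape_fits((1, 2, 3), "1-3, 1-4, 2-5"))    # True
print(shape_fits((1, 2, 3), "1-3, 1-4, 10-20"))  # False
```

The second call fails for the same reason as the incorrect example: the last dimension, 3, falls outside 10-20.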

Wildcards & Ellipses

As during runtime, wildcards specify that an array dimension must exist but can be any size, and ellipses specify that any number of additional dimensions may be included.

Correct

 7  AT_LEAST_2D = NDArray[Shape["*, *, ..."]]
 8
 9  # two or more dimensions!
10  x: AT_LEAST_2D = np.ones((2, 3))
11  y: AT_LEAST_2D = np.ones((2, 3, 4))
12  z_pre: AT_LEAST_2D = np.ones((2, 3, 4, 5))
13  z = cast(NDArray[Shape[2, 3, 4, 5]], z_pre)
14
15  # The LHS types will lose their constructor enrichment
16  reveal_type(x)
17  reveal_type(y)
18
19  # casting can rescue it
20  reveal_type(z)

examples/correct/shape_wildcard.py:16: note: Revealed type is "numpy.ndarray[tuple[int, int, *tuple[int, ...]], numpy.dtype[Any]]"
examples/correct/shape_wildcard.py:17: note: Revealed type is "numpy.ndarray[tuple[int, int, *tuple[int, ...]], numpy.dtype[Any]]"
examples/correct/shape_wildcard.py:20: note: Revealed type is "numpy.ndarray[tuple[Literal[2], Literal[3], Literal[4], Literal[5]], numpy.dtype[Any]]"
Success: no issues found in 1 source file

Incorrect

 7  AT_LEAST_2D = NDArray[Shape["*, *, ..."]]
 8
 9
10  def returns_atleast2d() -> AT_LEAST_2D:
11      return np.ones((2, 3, 4))
12
13
14  # two or more dimensions!
15  x: AT_LEAST_2D = np.ones((1,))
16
17  # wildcards can't be narrowed without cast,
18  # even if the narrowing is within the range
19  # (since the rhs could be a shape that doesn't fit the narrowing)
20  y: NDArray[Shape[2, 3, 4]] = returns_atleast2d()
21
22  reveal_type(x)
23  reveal_type(y)

examples/incorrect/shape_wildcard.py:15: error: Incompatible types in
assignment (expression has type "ndarray[int, dtype[Any]]", variable has type
"ndarray[tuple[int, int, *tuple[int, ...]], dtype[Any]]")  [assignment]
    x: AT_LEAST_2D = np.ones((1,))
                     ^~~~~~~~~~~~~
examples/incorrect/shape_wildcard.py:20: error: Incompatible types in
assignment (expression has type
"ndarray[tuple[int, int, *tuple[int, ...]], dtype[Any]]", variable has type
"ndarray[tuple[Literal[2], Literal[3], Literal[4]], dtype[Any]]")  [assignment]
    y: NDArray[Shape[2, 3, 4]] = returns_atleast2d()
                                 ^~~~~~~~~~~~~~~~~~~
examples/incorrect/shape_wildcard.py:22: note: Revealed type is "numpy.ndarray[tuple[int, int, *tuple[int, ...]], numpy.dtype[Any]]"
examples/incorrect/shape_wildcard.py:23: note: Revealed type is "numpy.ndarray[tuple[Literal[2], Literal[3], Literal[4]], numpy.dtype[Any]]"
Found 2 errors in 1 file (checked 1 source file)
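
The wildcard and ellipsis semantics described above can be sketched in a few lines of plain Python. This is an illustrative stand-in rather than the library's matcher: `matches` is a hypothetical helper, it assumes the ellipsis only appears as the final entry, and it ignores dimension labels such as `x` or `rgb` by keeping only the first word of each entry.

```python
def matches(shape: tuple[int, ...], pattern: str) -> bool:
    """Check a concrete shape against a pattern like "*, *, ..." or "* x, * y, 3 rgb"."""
    # Keep only the first word of each entry so "3 rgb" becomes "3".
    tokens = [part.split()[0] for part in pattern.split(",")]

    # Assumption: a trailing "..." allows any number of extra dimensions.
    open_ended = tokens[-1] == "..."
    if open_ended:
        tokens = tokens[:-1]

    if len(shape) < len(tokens):
        return False  # a required dimension is missing
    if not open_ended and len(shape) != len(tokens):
        return False  # extra dimensions are only allowed after "..."

    # "*" accepts any size; a number must match exactly.
    return all(tok == "*" or int(tok) == size for tok, size in zip(tokens, shape))


print(matches((2, 3), "*, *, ..."))        # True
print(matches((2, 3, 4, 5), "*, *, ..."))  # True
print(matches((1,), "*, *, ..."))          # False
```

The same helper also rejects a grayscale frame where an RGB one is expected: `matches((1920, 1080), "* x, * y, 3 rgb")` is False because the third dimension is missing.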

Dtype Checking

Correct

 5  from numpydantic import NDArray, dtype
 6
 7  # literal dtypes
 8  w: NDArray[Any, np.uint8] = np.ones((1, 2, 3), dtype=np.uint8)
 9
10  # builtin aliases for their numpy equivalents work
11  x: NDArray[Any, float] = np.ones((1, 2, 3), dtype=np.float64)
12
13  # Unions too
14  y: NDArray[Any, np.uint8 | np.uint16] = np.ones((1, 2, 3), dtype=np.uint8)
15
16  # And numpydantic's alias types
17  z: NDArray[Any, dtype.Integer] = np.ones((1, 2, 3), dtype=np.uint8)
18
19  reveal_type(w)
20  reveal_type(x)
21  reveal_type(y)
22  reveal_type(z)

examples/correct/dtype_basic.py:19: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.unsignedinteger[numpy._typing._nbit_base._8Bit]]]"
examples/correct/dtype_basic.py:20: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.float64]]"
examples/correct/dtype_basic.py:21: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.unsignedinteger[numpy._typing._nbit_base._8Bit] | numpy.unsignedinteger[numpy._typing._nbit_base._16Bit]]]"
examples/correct/dtype_basic.py:22: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.signedinteger[numpy._typing._nbit_base._8Bit] | numpy.signedinteger[numpy._typing._nbit_base._16Bit] | numpy.signedinteger[numpy._typing._nbit_base._32Bit] | numpy.signedinteger[numpy._typing._nbit_base._64Bit] | numpy.signedinteger[numpy._typing._nbit_base._16Bit] | numpy.unsignedinteger[numpy._typing._nbit_base._8Bit] | numpy.unsignedinteger[numpy._typing._nbit_base._16Bit] | numpy.unsignedinteger[numpy._typing._nbit_base._32Bit] | numpy.unsignedinteger[numpy._typing._nbit_base._64Bit] | numpy.unsignedinteger[numpy._typing._nbit_base._16Bit]]]"
Success: no issues found in 1 source file

Incorrect

 5  from numpydantic import NDArray, dtype
 6
 7  # incorrect literal dtypes
 8  w: NDArray[Any, np.uint16] = np.ones((1, 2, 3), dtype=np.uint8)
 9
10  # builtin aliases for their numpy equivalents
11  x: NDArray[Any, int] = np.ones((1, 2, 3), dtype=np.float64)
12
13  # Unions too
14  y: NDArray[Any, np.uint8 | np.uint16] = np.ones((1, 2, 3), dtype=np.float64)
15
16  # And numpydantic's alias types
17  z: NDArray[Any, dtype.Integer] = np.ones((1, 2, 3), dtype=np.float64)

examples/incorrect/dtype_basic.py:8: error: Incompatible types in assignment
(expression has type "ndarray[int, dtype[unsignedinteger[_8Bit]]]", variable has
type "ndarray[Any, dtype[unsignedinteger[_16Bit]]]")  [assignment]
    w: NDArray[Any, np.uint16] = np.ones((1, 2, 3), dtype=np.uint8)
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
examples/incorrect/dtype_basic.py:11: error: Incompatible types in assignment
(expression has type "ndarray[int, dtype[float64]]", variable has type
"ndarray[Any, dtype[signedinteger[_32Bit | _64Bit]]]")  [assignment]
    x: NDArray[Any, int] = np.ones((1, 2, 3), dtype=np.float64)
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
examples/incorrect/dtype_basic.py:14: error: Incompatible types in assignment
(expression has type "ndarray[int, dtype[float64]]", variable has type
"ndarray[Any, dtype[unsignedinteger[_8Bit] | unsignedinteger[_16Bit]]]") 
[assignment]
    y: NDArray[Any, np.uint8 | np.uint16] = np.ones((1, 2, 3), dtype=np.fl...
                                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~...
examples/incorrect/dtype_basic.py:17: error: Incompatible types in assignment
(expression has type "ndarray[int, dtype[float64]]", variable has type
"ndarray[Any, dtype[signedinteger[_8Bit] | signedinteger[_16Bit] | signedinteger[_32Bit] | signedinteger[_64Bit] | unsignedinteger[_8Bit] | unsignedinteger[_16Bit] | unsignedinteger[_32Bit] | unsignedinteger[_64Bit]]]")
 [assignment]
    z: NDArray[Any, dtype.Integer] = np.ones((1, 2, 3), dtype=np.float64)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Found 4 errors in 1 file (checked 1 source file)

Constructor Inference

Constructor inference works by modifying the type returned from supported array constructors:

x = np.ones((1, 2, 3), dtype=np.float32)
reveal_type(x)

Without the plugin

examples/correct/numpy_inference.py:6: note: Revealed type is "numpy.ndarray[tuple[int, int, int], numpy.dtype[numpy.floating[numpy._typing._nbit_base._32Bit]]]"
Success: no issues found in 1 source file

With the plugin

examples/correct/numpy_inference.py:6: note: Revealed type is "numpy.ndarray[tuple[Literal[1], Literal[2], Literal[3], fallback=int], numpy.dtype[numpy.floating[numpy._typing._nbit_base._32Bit]]]"
Success: no issues found in 1 source file

Each interface may support constructor inference by declaring an InterfaceTyping class with a set of ConstructorSpec objects. You can see the currently supported constructors on the relevant interface pages (e.g. for numpy, zarr).

The ConstructorSpecs declare how to locate the shape and dtype args or kwargs (these are almost always the shape in the first positional arg and the dtype specified as a kwarg, but why assume!).

So, if enabled, non-numpy interfaces can declare how to infer the shapes and dtypes of their constructors’ return values:

from typing import reveal_type

import dask.array as da
import numpy as np
import zarr

x = np.zeros((3, 4, 5), dtype=np.uint8)
y = da.zeros((3, 4, 5), dtype=np.uint8)
z = zarr.zeros((3, 4, 5), another=int, dtype=np.uint8)

reveal_type(x)
reveal_type(y)
reveal_type(z)

Without the plugin

examples/correct/interface_inference.py:11: note: Revealed type is "numpy.ndarray[tuple[int, int, int], numpy.dtype[numpy.unsignedinteger[numpy._typing._nbit_base._8Bit]]]"
examples/correct/interface_inference.py:12: note: Revealed type is "Any"
examples/correct/interface_inference.py:13: note: Revealed type is "Any"
Success: no issues found in 1 source file

With the plugin

examples/correct/interface_inference.py:11: note: Revealed type is "numpy.ndarray[tuple[Literal[3], Literal[4], Literal[5], fallback=int], numpy.dtype[numpy.unsignedinteger[numpy._typing._nbit_base._8Bit]]]"
examples/correct/interface_inference.py:12: note: Revealed type is "Any"
examples/correct/interface_inference.py:13: note: Revealed type is "numpy.ndarray[tuple[Literal[3], Literal[4], Literal[5], fallback=int], numpy.dtype[numpy.unsignedinteger[numpy._typing._nbit_base._8Bit]]]"
Success: no issues found in 1 source file

Dask is broken

Note that dask’s constructor inference doesn’t work at the moment. This is due to dask’s array creation routines being positively haunted: they are an untyped, wrapped, dynamic construction of a function that creates a class that creates a class.

PRs welcome re: figuring out how to type that.

Todo

This is less than optimal. Even though the inferred types from the constructors are Any without the plugin, and so being typed with numpy methods is better than nothing, we are missing the backend-specific types.

In the future we will be extending the mypy plugin to understand the NDArraySchema()-style annotated types, PRs welcome!

from typing import Annotated as A
import zarr

from numpydantic import NDArraySchema, Shape

def make_zarr() -> A[zarr.Array, NDArraySchema(Shape(1, 2, 3))]:
    return zarr.zeros((1,2,3))

Pydantic Models

When used as a type on a pydantic model, numpydantic is able to coerce convenience input types into arrays. This means that we should consider some decidedly non-array inputs as satisfying an array type - like paths and strings - which makes sense for pydantic models, but would be bad to accept as a function param.

The mypy plugin can detect when the annotation is being used within a pydantic model. In that case it allows the types within the enabled interfaces’ input_types() lists, and otherwise it refuses them.

When checking non-array input types like a path, shape and dtype checking is unsupported, as it would be an absolutely absurd thing to do to open on-disk array stores or analyze videos while type checking.

Correct

class MyModel(BaseModel):
    array: NDArray[Shape[1, 2, 3]]


instance = MyModel(array=H5ArrayPath("./example.h5", "/some/dataset"))

Success: no issues found in 1 source file

Incorrect

 7  class MyModel(BaseModel):
 8      array: NDArray[Shape[1, 2, 3]]
 9
10
11  # you can't just use any old thing
12  x = MyModel(array="./example.h5")
13  y = MyModel(array=5)
14
15
16  # and the annotation refuses to accept non-array inputs
17  # when used in non-pydantic contexts
18  def passthrough(array: NDArray[Shape[1, 2, 3]]) -> NDArray[Shape[1, 2, 3]]:
19      return array
20
21
22  z = passthrough(H5ArrayPath("./example.h5", "/some/dataset"))

examples/incorrect/pydantic_field.py:12: error: Argument "array" to "MyModel"
has incompatible type "str"; expected
"ndarray[tuple[Literal[1], Literal[2], Literal[3]], dtype[Any]] | ZarrArrayPath | VideoProxy | zarr.core.Array | H5ArrayPath | dask.array.core.Array | Path | VideoCapture | H5Proxy"
 [arg-type]
    x = MyModel(array="./example.h5")
                      ^~~~~~~~~~~~~~
examples/incorrect/pydantic_field.py:13: error: Argument "array" to "MyModel"
has incompatible type "int"; expected
"ndarray[tuple[Literal[1], Literal[2], Literal[3]], dtype[Any]] | ZarrArrayPath | VideoProxy | zarr.core.Array | H5ArrayPath | dask.array.core.Array | Path | VideoCapture | H5Proxy"
 [arg-type]
    y = MyModel(array=5)
                      ^
examples/incorrect/pydantic_field.py:22: error: Argument 1 to "passthrough" has
incompatible type "H5ArrayPath"; expected
"ndarray[tuple[Literal[1], Literal[2], Literal[3]], dtype[Any]]"  [arg-type]
    z = passthrough(H5ArrayPath("./example.h5", "/some/dataset"))
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Found 3 errors in 1 file (checked 1 source file)

Known Limitations

Range-range checking

Our implementation of ranges is somewhat cursed, and it is currently impossible to check whether a right-hand side range is contained within a left-hand side range. This is because we check ranges against literals using __eq__, which is commutative, so the comparison has no sense of direction.

E.g. the following should fail, but it does not:

import numpy as np

from numpydantic import NDArray, Shape


def make_range() -> NDArray[Shape["3-5, 3-5"]]:
    return np.ones((4, 4))


x: NDArray[Shape["6-8, 6-8"]] = make_range()

Success: no issues found in 1 source file

Pull requests welcome!
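
For anyone picking this up, the difference between the two kinds of comparison can be shown in plain Python. This is a toy model, not numpydantic's code: `RangeSketch` is a hypothetical class whose `__eq__` treats an integer inside the range as "equal" (which gives the same answer whichever side the range is on), while `contains` is the directional check that range-range comparison would need.

```python
class RangeSketch:
    """Hypothetical inclusive range used only to illustrate the limitation."""

    def __init__(self, low: int, high: int) -> None:
        self.low, self.high = low, high

    def __eq__(self, other: object) -> bool:
        # Equality-style matching: an int "equals" the range when it falls inside.
        if isinstance(other, int):
            return self.low <= other <= self.high
        return NotImplemented

    def contains(self, other: "RangeSketch") -> bool:
        # Containment is directional: swapping the operands changes the answer.
        return self.low <= other.low and other.high <= self.high


r = RangeSketch(1, 3)
# Python falls back to the reflected __eq__, so both orders agree.
print(r == 2, 2 == r)  # True True

wide, narrow = RangeSketch(1, 10), RangeSketch(3, 5)
print(wide.contains(narrow), narrow.contains(wide))  # True False

# The case from the example above: "3-5" does not fit inside "6-8".
print(RangeSketch(6, 8).contains(RangeSketch(3, 5)))  # False
```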