Syntax¶
General form:
field: NDArray[Shape["{shape_expression}"], dtype]
Type checker compatibility¶
For better compatibility with static type checkers,
you can use typing.Literal in place of Shape with a string literal,
anywhere Shape is accepted.
field: NDArray[Literal["{shape_expression}"], dtype]
Or, if you don’t need axis labels, you can pass the parts of a shape expression as separate arguments:
field: NDArray[Shape[1, 2, 3], dtype]
And if your type checker complains about using a string literal as a generic, Shape can also be invoked as a callable:
field: NDArray[Shape(1, 2, 3), dtype]
If you don’t need compatibility with multiple array backends,
you can use the annotated schema form with NDArraySchema within pydantic models:
from typing import Annotated
import numpy as np
from pydantic import BaseModel
from numpydantic import NDArraySchema, Shape
class MyModel(BaseModel):
    field: Annotated[np.ndarray, NDArraySchema(Shape("{shape_expression}"), dtype)]
NDArraySchema also validates that the given array is of the specified array type,
rather than accepting any array backend that matches the dtype and shape.
Dtype¶
Dtype checking is for the most part as simple as an isinstance check:
the dtype attribute of the array is checked against the dtype provided in the
NDArray annotation. Both numpy and builtin python types can be used.
A tuple of types can also be passed:
field: NDArray[Shape["2, 3"], (np.int8, np.uint8)]
Like nptyping, the dtype module provides convenient access
and aliases to the common dtypes, but it also provides “generic” dtypes such as
Float, which is a tuple of all subclasses of
numpy.floating. Numpy interprets float as being equivalent to
numpy.float64, and numpy.floating is an abstract parent class,
so the “generic” tuple dtypes fill that narrow gap.
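The gap described above can be seen with plain numpy. The following is a minimal sketch, not numpydantic's implementation: FLOAT_TYPES and dtype_matches are hypothetical names used only for illustration, and the tuple lists three concrete float types by hand rather than reproducing the library's own Float.

```python
import numpy as np

# Hypothetical stand-in for a "generic" float dtype: a hand-written tuple of
# concrete numpy float types (an assumption, not numpydantic's own Float).
FLOAT_TYPES = (np.float16, np.float32, np.float64)


def dtype_matches(array: np.ndarray, allowed) -> bool:
    """Return True if the array's scalar type is a subclass of an allowed type."""
    return issubclass(array.dtype.type, allowed)


half = np.zeros(3, dtype=np.float16)
single = np.zeros(3, dtype=np.float32)
double = np.zeros(3, dtype=np.float64)

# numpy treats the builtin float as float64 ...
assert np.dtype(float) == np.dtype(np.float64)
# ... so a check against the builtin float alone rejects narrower floats.
assert dtype_matches(double, float)
assert not dtype_matches(single, float)
# A tuple of concrete types accepts every listed precision.
assert all(dtype_matches(a, FLOAT_TYPES) for a in (half, single, double))
# Integer arrays are still rejected.
assert not dtype_matches(np.zeros(3, dtype=np.int8), FLOAT_TYPES)
```

The same tuple-based check is what makes an annotation like (np.int8, np.uint8) accept either of the two integer types.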
Todo
Future versions will support interfaces providing type maps for declaring equality between dtypes that may be specific to one library but should be considered equivalent to numpy’s or another library’s dtypes.
Todo
Future versions will also support declaring minimum or maximum precisions, so one might say “at least a 16-bit float” and also accept a 32-bit float.
Shape¶
Full documentation of nptyping’s shape syntax is available in the nptyping docs, but for the sake of self-contained docs, the high points are:
Numerical Shape¶
A comma-separated list of integers.
For a 2-dimensional, 3 x 4-shaped array:
Shape["3, 4"]
Wildcards¶
Wildcards indicate that a dimension can be any size.
For a 2-dimensional, 3 x any-shaped array:
Shape["3, *"]
Ranges¶
Dimension sizes can also be specified as ranges[1]. Ranges must have no whitespace, and may use integers or wildcards. Range specifications are inclusive on both ends.
For an array whose…
First dimension can be of length 2, 3, or 4
Second dimension is 2 or greater
Third dimension is 4 or less
Shape["2-4, 2-*, *-4"]
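The numerical, wildcard, and range rules above can be illustrated with a small pure-Python check. This is a sketch of the matching semantics as described in this section only, not numpydantic's implementation: dim_matches and shape_matches are hypothetical helpers, and labels and '...' are deliberately not handled.

```python
def dim_matches(spec: str, size: int) -> bool:
    """Check one dimension size against an integer, wildcard, or range spec."""
    spec = spec.strip()
    if spec == "*":
        return True
    if "-" in spec:
        low, high = spec.split("-")
        # Both ends are inclusive; "*" leaves that end unbounded.
        above_low = low == "*" or size >= int(low)
        below_high = high == "*" or size <= int(high)
        return above_low and below_high
    return size == int(spec)


def shape_matches(expression: str, shape: tuple) -> bool:
    """Check a concrete shape tuple against a comma-separated shape expression."""
    specs = expression.split(",")
    if len(specs) != len(shape):
        return False
    return all(dim_matches(spec, size) for spec, size in zip(specs, shape))


expression = "2-4, 2-*, *-4"
assert shape_matches(expression, (2, 2, 4))
assert shape_matches(expression, (4, 100, 1))
assert not shape_matches(expression, (5, 2, 4))  # first dimension too long
assert not shape_matches(expression, (2, 1, 4))  # second dimension too short
assert not shape_matches(expression, (2, 2, 5))  # third dimension too long
assert not shape_matches(expression, (2, 2))     # wrong number of dimensions
```

The boundary cases (2, 2, 4) and (4, 100, 1) pass because range specifications are inclusive on both ends.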
Labels¶
Dimensions can be given labels. In future versions, these labels will be propagated to the generated JSON Schema:
Shape["3 x, 4 y, 5 z"]
Arbitrary dimensions¶
After some specified dimensions, a trailing ... expresses that any number
of additional dimensions may follow:
Shape["3, 4, ..."]
Any-Shaped¶
If the dtype is also Any, you can simply use:
field: NDArray
If a dtype is being passed, use the '*' wildcard along with the '...':
field: NDArray[Shape['*, ...'], int]
Caveats¶
Todo
Unlike nptyping, numpydantic does not currently support structured dtypes or
numpy.recarray specifications. Support is planned for future versions.
Todo
numpydantic also does not support the variable shape definition form, such as
Shape['Dim, Dim']
where there are two dimensions of any size as long as they are equal.
At the moment it appears impossible to express dynamic constraints
(i.e. minItems/maxItems that depend on the shape of another array)
in JSON Schema. A future minor version will allow this form by generating a JSON
Schema, with a warning that the equal-shape constraint will not be represented.
See: https://github.com/orgs/json-schema-org/discussions/730