# Syntax

General form:

```python
field: NDArray[Shape["{shape_expression}"], dtype]
```

## Dtype

Dtype checking is for the most part as simple as an `isinstance` check -
the `dtype` attribute of the array is checked against the `dtype` provided in the
`NDArray` annotation. Both numpy and builtin python types can be used.

A tuple of types can also be passed:

```python
field: NDArray[Shape["2, 3"], (np.int8, np.uint8)]
```

Like `nptyping`, the {mod}`~numpydantic.dtype` module provides convenient access
and aliases to the common dtypes, but also provides "generic" dtypes like
{class}`~numpydantic.dtype.Float`, which is a tuple of all subclasses of
{class}`numpy.floating`. Numpy interprets `float` as being equivalent to
{class}`numpy.float64`, and {class}`numpy.floating` is an abstract parent class,
so "generic" tuple dtypes fill that narrow gap.

```{todo}
Future versions will support interfaces providing type maps for declaring
equality between dtypes that may be specific to that library but should be
considered equivalent to numpy or other libraries' dtypes.
```

```{todo}
Future versions will also support declaring minimum or maximum precisions,
so one might say "at least a 16-bit float" and also accept a 32-bit float.
```

## Shape

### Shape Forms

The individual constraints for a shape ([below](#shape-args)) can be expressed in several forms.
This is for [typechecking compatibility](typecheckers.md) and historical reasons.

Our goal is to converge on a type syntax that is close to numpy's:

```python
import numpy as np
from typing import Any, TypeVar

# TypeVar defaults (PEP 696) need Python 3.13+, or `typing_extensions.TypeVar`
_T_Shape = TypeVar("_T_Shape", bound=tuple[Any, ...], default=tuple)
_T_Dtype = TypeVar("_T_Dtype", bound=np.generic, default=Any)

NDArray = np.ndarray[_T_Shape, np.dtype[_T_Dtype]]
```

which just treats shape as a tuple (usually `tuple[int, ...]`),
and the dtype argument as a subscript of `np.dtype`.

All the type forms below are valid at runtime,
but only the tuple form will pass static typechecking without the mypy plugin.
#### Tuple Form (Preferred in >=v2)

In >v2.0, `Shape` will become an alias for `tuple`.

This form is somewhat in flux as we get it nailed down,
as certain typing constructs like ellipses and ranges are challenging to specify
or have nonideal default behavior.

The technically correct, but extremely annoying tuple form
uses {class}`~typing.Literal` values for every argument:

```python
from typing import Literal as L
from numpydantic import NDArray, Shape

# these are equivalent
NDArray[Shape[L[1], L["2-3"], L["*"], L["..."]]]
NDArray[tuple[L[1], L["2-3"], L["*"], ...]]
NDArray[tuple[L[1], L["2-3"], int, ...]]
```

Mypy, via the [plugin](mypy-plugin), will support typechecking a more reasonable form:

```python
NDArray[Shape[1, "2-3", "*", "..."]]
NDArray[tuple[1, "2-3", "*", ...]]
NDArray[tuple[1, "2-3", int, ...]]
```

and we will explore additional refinements as needed.

#### String Form (nptyping)

The pure string form is inherited from nptyping.
Its use is discouraged in new code: it will be deprecated in v2.0 and removed in v3.0.

The string form is syntactically invalid to the python type system,
and is less inspectable than the tuple form.

```python
# these are equivalent
NDArray[Shape["1, 2-3, *, ..."]]
NDArray[Shape[Literal["1, 2-3, *, ..."]]]
```

#### Functional Form

The functional form should only be used within {func}`.NDArraySchema`
or when it is otherwise the only form that satisfies static type checkers.

```python
Annotated[np.ndarray, NDArraySchema(Shape(1, "2-3", "*", "..."))]
```

### Shape Args

Full documentation of nptyping's shape syntax is available in the
[nptyping docs](https://github.com/ramonhagenaars/nptyping/blob/master/USERDOCS.md#Shape-expressions),
but for the sake of self-contained docs, the high points are:

#### Numerical Shape

A comma-separated list of integers.

For a 2-dimensional, 3 x 4-shaped array:

```python
Shape["3, 4"]
```

#### Wildcards

Wildcards indicate that a dimension can be any size.

For a 2-dimensional, 3 x any-shaped array:

```python
Shape["3, *"]
```

(shape-ranges)=
#### Ranges

Dimension sizes can also be specified as ranges[^rangesnote].
Ranges must have no whitespace, and may use integers or wildcards.
Range specifications are **inclusive** on both ends.

For an array whose...

- First dimension can be of length 2, 3, or 4
- Second dimension is 2 or greater
- Third dimension is 4 or less

```python
Shape["2-4, 2-*, *-4"]
```

[^rangesnote]: This is an extension to nptyping's syntax, and so using `nptyping.Shape` is unsupported - use {class}`numpydantic.Shape`.

#### Labels

Dimensions can be given labels, and in future versions these labels will be
propagated to the generated JSON Schema.

```python
Shape["3 x, 4 y, 5 z"]
```

#### Arbitrary dimensions

After some specified dimensions, one can express that there can be any number
of additional dimensions with a trailing `...`, like

```python
Shape["3, 4, ..."]
```

#### Any-Shaped

If `dtype` is also `Any`, one can just use

```python
field: NDArray
```

If a `dtype` is being passed, use the `'*'` wildcard along with the `'...'`

```python
field: NDArray[Shape['*, ...'], int]
```

## Annotated type with `NDArraySchema`

```{tip}
See also: [Typechecker Integration](typecheckers)
```

If you don't need compatibility with multiple array backends,
or want to have an array statically type check as a single array backend type,
use the annotated schema form with {func}`.NDArraySchema`.

```python
from typing import Annotated
import numpy as np
from pydantic import BaseModel
from numpydantic import NDArraySchema, Shape

class MyModel(BaseModel):
    field: Annotated[np.ndarray, NDArraySchema(Shape("{shape_expression}"), dtype)]
```

{func}`.NDArraySchema` also validates that the given array is of the specified type,
rather than any array backend that matches the dtype and shape.
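
The same shape expression is used in both the `NDArray[...]` and `NDArraySchema(...)` forms. To make the rules from [Shape Args](#shape-args) concrete, here is a minimal pure-Python sketch of how one expression could be compared against a concrete shape. `dim_matches` and `shape_matches` are hypothetical helpers written for illustration only: they are not part of numpydantic or nptyping, they do not reflect how the library is implemented, and the trailing `...` form is deliberately left out.

```python
def dim_matches(constraint: str, size: int) -> bool:
    """Check one dimension size against a single shape argument.

    Illustrative only: handles integers, the `*` wildcard, inclusive
    ranges, and optional labels. The `...` form is not handled.
    """
    spec = constraint.split()[0]          # drop an optional label: "3 x" -> "3"
    low, sep, high = spec.partition("-")  # "2-4" -> ("2", "-", "4")
    if not sep:                           # plain "3" or "*": same bound on both ends
        high = low
    low_ok = low == "*" or size >= int(low)     # lower bound is inclusive
    high_ok = high == "*" or size <= int(high)  # upper bound is inclusive
    return low_ok and high_ok


def shape_matches(expression: str, shape: tuple[int, ...]) -> bool:
    """Check a concrete shape against a comma-separated shape expression."""
    constraints = [part.strip() for part in expression.split(",")]
    if len(constraints) != len(shape):
        return False
    return all(dim_matches(c, s) for c, s in zip(constraints, shape))


print(shape_matches("2-4, 2-*, *-4", (4, 10, 1)))  # True
print(shape_matches("2-4, 2-*, *-4", (5, 10, 1)))  # False: first dimension exceeds 4
print(shape_matches("3 x, 4 y, 5 z", (3, 4, 5)))   # True: labels do not affect matching
```

Working on plain shape tuples keeps the sketch independent of any array backend; in practice the annotation itself performs this validation.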
## Caveats

```{todo}
numpydantic currently does not support structured dtypes or {class}`numpy.recarray`
specifications like nptyping does. It will in future versions.
```

````{todo}
numpydantic also does not support the variable shape definition form like

```python
Shape['Dim, Dim']
```

where there are two dimensions of any shape as long as they are equal
because at the moment it appears impossible to express dynamic constraints
(ie. `minItems`/`maxItems` that depend on the shape of another array)
in JSON Schema. A future minor version will allow them by generating a JSON
schema with a warning that the equal shape constraint will not be represented.

See: https://github.com/orgs/json-schema-org/discussions/730
````
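
Until the variable shape form is supported, one possible workaround is to declare the dimensions as wildcards and check the equality separately. The sketch below shows only that check on a plain shape tuple. `equal_labeled_dims` is a hypothetical helper, not part of numpydantic or nptyping.

```python
def equal_labeled_dims(labels: tuple[str, ...], shape: tuple[int, ...]) -> bool:
    """Return True if dimensions that share a label also share a size."""
    if len(labels) != len(shape):
        return False
    sizes: dict[str, int] = {}
    for label, size in zip(labels, shape):
        # The first time a label is seen its size is recorded;
        # later occurrences must match that recorded size.
        if sizes.setdefault(label, size) != size:
            return False
    return True


print(equal_labeled_dims(("Dim", "Dim"), (3, 3)))  # True
print(equal_labeled_dims(("Dim", "Dim"), (3, 4)))  # False
```

A check like this could be run on `array.shape` from a pydantic validator on a field annotated with `NDArray[Shape["*, *"], dtype]`. The equality would then be enforced only at validation time and would not appear in the generated JSON Schema.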