unumpy¶
Note
This page describes the overall philosophy behind unumpy. If you are interested in a general dispatch mechanism, see uarray.
unumpy builds on top of uarray. It is an effort to specify the core NumPy API, and provide backends for the API.
What’s new in unumpy?¶
unumpy is the first attempt to use uarray to build a generic backend system for (what we hope will be) the core NumPy API specification. It is possible to create a backend object and use it to perform operations. It is also possible to change the active backend via a context manager.
Relation to the NumPy duck-array ecosystem¶
There are three main NumPy enhancement proposals (NEPs) inside NumPy itself that relate to the duck-array ecosystem. NEP-22 is a high-level overview of the duck-array ecosystem and of the direction NumPy intends to move in. Two main protocols were introduced to support this ecosystem: the __array_function__ protocol defined in NEP-18, and the older __array_ufunc__ protocol defined in NEP-13.
unumpy provides an alternate framework based on uarray, bypassing the __array_function__ and __array_ufunc__ protocols entirely. It provides a clear separation of concerns: it defines callables which can be overridden, and expresses everything else in terms of these callables. See the uarray documentation for more details.
unumpy¶
Note
If you are interested in writing backends or multimethods for unumpy, please look at the documentation for uarray, which explains how to do this.
unumpy is meant for three groups of individuals:
- Those who write array-like objects, like developers of Dask, Xnd, PyData/Sparse, CuPy, and others.
- Library authors or programmers who hope to target multiple array backends, such as those listed above.
- Users who wish to target their code to other backends.
For example, the following is currently possible:
>>> import uarray as ua
>>> import unumpy as np
>>> import unumpy.dask_backend as dask_backend
>>> import unumpy.sparse_backend as sparse_backend
>>> import sparse, dask.array as da
>>> def main():
...     x = np.zeros(5)
...     return np.exp(x)
>>> with ua.set_backend(dask_backend):
...     isinstance(main(), da.core.Array)
True
>>> with ua.set_backend(sparse_backend):
...     isinstance(main(), sparse.SparseArray)
True
Now imagine some arbitrarily nested code, all of whose implementations can be switched out using a simple context manager.
unumpy is an in-progress mirror of the NumPy API which allows the user to dynamically switch out the backend that is used. It also allows auto-selection of the backend based on the arguments passed into a function. It does this by defining a collection of uarray multimethods that support dispatch.
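Argument-based selection can be pictured with the following sketch, which picks a backend by inspecting the type of the value passed in. This only illustrates the general idea under simplifying assumptions; the registry, the backend classes and the double function are hypothetical and do not reflect uarray's actual dispatch rules.

```python
class ListBackend:
    """Hypothetical backend that handles Python lists."""
    handled_type = list

    @staticmethod
    def double(x):
        return [2 * v for v in x]


class TupleBackend:
    """Hypothetical backend that handles Python tuples."""
    handled_type = tuple

    @staticmethod
    def double(x):
        return tuple(2 * v for v in x)


REGISTERED_BACKENDS = [ListBackend, TupleBackend]


def double(x):
    """Use the first registered backend that recognises the argument's type."""
    for backend in REGISTERED_BACKENDS:
        if isinstance(x, backend.handled_type):
            return backend.double(x)
    raise TypeError(f"no registered backend handles {type(x).__name__}")


from_list = double([1, 2])
from_tuple = double((1, 2))
```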
Although it currently provides a number of backends, the aspiration is that, with time, these backends will move into the respective libraries and it will be possible to use the library modules directly as backends.
Note that currently, our coverage is very incomplete. However, we have attempted to provide at least one of each kind of object in unumpy for reference. There are ufuncs and ndarrays, which are classes, methods on ufunc such as __call__ and reduce, and also functions such as sum.
Where possible, we attempt to provide default implementations so that the whole API does not have to be reimplemented. However, it might still be useful to override a default, either to gain speed or to reimplement it in terms of other functions which already exist in your library.
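The default-implementation idea can be sketched as follows: a sum is expressed in terms of a binary add unless the backend supplies its own version. The backend classes and helper names are hypothetical, and the fallback logic is an assumption made for illustration, not unumpy's actual code.

```python
import functools
import math


class MinimalBackend:
    """Hypothetical backend that only implements a binary add."""

    @staticmethod
    def add(a, b):
        return a + b


class TunedBackend(MinimalBackend):
    """Hypothetical backend that also ships its own, specialised sum."""

    @staticmethod
    def sum(values):
        return math.fsum(values)


def default_sum(backend, values):
    # Default: express sum purely in terms of the backend's add.
    return functools.reduce(backend.add, values)


def backend_sum(backend, values):
    # Prefer the backend's own sum when it exists, else fall back to the default.
    native = getattr(backend, "sum", None)
    if native is not None:
        return native(values)
    return default_sum(backend, values)


via_default = backend_sum(MinimalBackend, [1, 2, 3])
via_native = backend_sum(TunedBackend, [1.0, 2.0, 3.0])
```

A backend author who implements only add still gets a working sum, and can supply a faster one later without changing user code.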
The idea is that once things are more mature, it will be possible to switch out your backend by changing a single import statement:
import numpy as np # Old method
import unumpy as np # Once this project is mature
Currently, the following functions are supported:
- All NumPy universal functions.
For the full range of functions, use dir(unumpy).
You can use the uarray.set_backend context manager to select the desired backend. Note that not every backend supports every method. For example, PyTorch does not have an exact ufunc equivalent, so we dispatch to actual methods using a dictionary lookup. The following backends are supported:
- numpy_backend
- torch_backend
- xnd_backend
- dask_backend
- cupy_backend
- sparse_backend
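The dictionary lookup mentioned above for libraries without a ufunc equivalent might look roughly like the sketch below. The standard math module stands in for the target library, and UfuncStandIn, call_ufunc and the table contents are hypothetical; the real torch_backend may be organised differently.

```python
import math


class UfuncStandIn:
    """Hypothetical stand-in for a ufunc object: it only carries a name."""

    def __init__(self, name):
        self.name = name


exp = UfuncStandIn("exp")
sqrt = UfuncStandIn("sqrt")
arctan = UfuncStandIn("arctan")

# The target library exposes plain functions, and their names do not always
# match the ufunc names (arctan versus atan here).
_NATIVE_FUNCTIONS = {
    "exp": math.exp,
    "sqrt": math.sqrt,
    "arctan": math.atan,
}


def call_ufunc(ufunc, *args):
    """Look up the native function for `ufunc`; decline if there is none."""
    try:
        native = _NATIVE_FUNCTIONS[ufunc.name]
    except KeyError:
        return NotImplemented
    return native(*args)


exp_of_zero = call_ufunc(exp, 0.0)
root_of_nine = call_ufunc(sqrt, 9.0)
unsupported = call_ufunc(UfuncStandIn("cbrt"), 8.0)
```

Returning NotImplemented for a missing entry mirrors the point that not every backend supports every method.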
Writing Backends¶
Since unumpy is based on uarray, all overrides are done via the __ua_*__ protocols. We strongly recommend you read the uarray documentation for context.
All functions/methods in unumpy are uarray multimethods. This means you can override them using the __ua_function__ protocol.
In addition, unumpy allows dispatch on numpy.ndarray, numpy.ufunc and numpy.dtype via the __ua_convert__ protocol.
Dispatching on objects means one can intercept these, convert to an equivalent native format, or dispatch on their methods, including __call__.
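The overall shape of a backend that uses both protocols can be sketched as below. The sketch deliberately avoids importing uarray so that it runs on its own: Dispatchable, ArrayMarker and NativeArray are local, hypothetical stand-ins, and the method signatures follow the author's reading of the uarray documentation, which remains the authoritative reference.

```python
from collections import namedtuple

# Stand-in for the objects handed to __ua_convert__, assumed here to expose
# `.value` and `.type`. Defined locally so the sketch runs without uarray.
Dispatchable = namedtuple("Dispatchable", ["value", "type"])


class ArrayMarker:
    """Hypothetical dispatch type playing the role of numpy.ndarray."""


class NativeArray:
    """Hypothetical array type owned by the backend's library."""

    def __init__(self, data):
        self.data = list(data)


def _negative(x):
    return NativeArray(-v for v in x.data)


_IMPLEMENTATIONS = {"negative": _negative}


class SketchBackend:
    __ua_domain__ = "numpy"

    @staticmethod
    def __ua_convert__(dispatchables, coerce):
        converted = []
        for d in dispatchables:
            if d.type is ArrayMarker and not isinstance(d.value, NativeArray):
                if not coerce:
                    # Decline to convert foreign arrays unless coercion is allowed.
                    return NotImplemented
                converted.append(NativeArray(d.value))
            else:
                converted.append(d.value)
        return converted

    @staticmethod
    def __ua_function__(func, args, kwargs):
        # Find an implementation by the multimethod's name; decline otherwise.
        impl = _IMPLEMENTATIONS.get(func.__name__)
        if impl is None:
            return NotImplemented
        return impl(*args, **kwargs)


def negative(x):
    """Stand-in for a multimethod; only its __name__ is used here."""


(converted,) = SketchBackend.__ua_convert__(
    [Dispatchable([1, 2], ArrayMarker)], coerce=True
)
result = SketchBackend.__ua_function__(negative, (converted,), {})
declined = SketchBackend.__ua_convert__(
    [Dispatchable([1, 2], ArrayMarker)], coerce=False
)
```

Conversion and execution are separate steps here: the first turns foreign inputs into the backend's native type, the second runs the operation on already-converted arguments.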
We suggest you browse the source for example backends.
Differences between overriding numpy.ufunc objects and other multimethods¶
Of note here is that there are certain callable objects within NumPy, most prominently numpy.ufunc objects, that are not typical functions/methods, and so cannot be directly overridden, the key word here being directly.
In Python, when a method is called, i.e. x.method(*a, **kw), it is the same as writing type(x).method(x, *a, **kw), assuming that method was a regular method defined on the type. This allows some very interesting things to happen.
For instance, if we make method a multimethod, it allows us to override methods, provided we know that the first argument passed in will be x.
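The equivalence between the two call forms, and the way it lets a method be overridden through its first argument, can be checked with a small plain-Python example. Vector, dispatching_scaled and the override table are hypothetical names used only for this illustration; uarray multimethods are more general than this.

```python
class Vector:
    def __init__(self, values):
        self.values = list(values)

    def scaled(self, factor):
        return [factor * v for v in self.values]


x = Vector([1, 2, 3])

# The bound call and the explicit call through the type are the same thing.
bound_call = x.scaled(2)
explicit_call = type(x).scaled(x, 2)

# Because the instance always arrives as the first argument, the method can be
# replaced by a function that chooses an implementation based on that argument.
_original_scaled = Vector.scaled
_overrides = {}  # maps a type to a replacement implementation


def dispatching_scaled(self, factor):
    impl = _overrides.get(type(self), _original_scaled)
    return impl(self, factor)


Vector.scaled = dispatching_scaled
before_override = x.scaled(2)

_overrides[Vector] = lambda self, factor: tuple(factor * v for v in self.values)
after_override = x.scaled(2)
```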
One other thing that is possible (and done in unumpy) is to override the __call__ method on a callable object. This is, in fact, exactly how to override a ufunc.
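The same reasoning applies to __call__, since calling an object looks the method up on its type. The following sketch uses a hypothetical UfuncLike class rather than a real numpy.ufunc, and a hypothetical table of native functions; it illustrates the mechanism, not unumpy's code.

```python
class UfuncLike:
    """Hypothetical callable object; not a real numpy.ufunc."""

    def __init__(self, name):
        self.name = name

    def __call__(self, x):
        # Placeholder behaviour, used until __call__ is overridden.
        raise NotImplementedError(self.name)


negative = UfuncLike("negative")

# Hypothetical native functions supplied by some backend.
_native_impls = {"negative": lambda x: -x}


def overridden_call(self, x):
    # `self` is the callable object itself, so its name selects the function.
    return _native_impls[self.name](x)


# Replacing the one method on the type changes what every instance does
# when it is called.
UfuncLike.__call__ = overridden_call

direct = negative(3)
explicit = type(negative).__call__(negative, 3)
```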
Other interesting things that can be done (but as of now, are not) include replacing ufunc objects entirely with native equivalents by overriding the __get__ method. This technique can also be applied to dtype objects.
Meta-array support¶
Meta-arrays are arrays that can hold other arrays, such as Dask arrays and XArray datasets.
If meta-arrays and libraries depend on unumpy instead of NumPy, they can benefit from containerization and hold arbitrary arrays, not just numpy.ndarray objects.
Inside their __ua_function__ implementation, they might need to do something like the following:
>>> class Backend: pass
>>> meta_backend = Backend()
>>> meta_backend.__ua_domain__ = "numpy"
>>> def ua_func(f, a, kw):
...     # We do this to avoid infinite recursion
...     with ua.skip_backend(meta_backend):
...         # Actual implementation here
...         pass
>>> meta_backend.__ua_function__ = ua_func
In this form, one could do something like the following to use the meta-backend:
>>> with ua.set_backend(sparse_backend), ua.set_backend(dask_backend):
...     x = np.zeros((2000, 2000))
...     isinstance(x, da.Array)
...     isinstance(x.compute(), sparse.SparseArray)
True
True