Upstream uarray integration #241

Closed
@eric-czech

I noticed a thread between @hameerabbasi and some Xarray folks in pydata/xarray#1938 (comment) and was curious whether you guys would be willing to talk a little bit about the state of uarray integration in other PyData projects. It looks like this was all pretty nascent then, and I was disappointed to see that there aren't any more open issues about working it into Xarray (or so it seems).

I like the idea a lot and I'm trying to understand how best to think about its usage by scikit developers. More specifically, we work in statistical genetics and are coordinating with a few other developers in the space to help think about the next generation of tools like scikit-allel. A big question for me in the context of uarray is how the Backends would interoperate between projects if they attempt to address similar problems.

For example, a minority of the functionality we need will be covered directly by the numpy API (e.g. row-wise/column-wise summary statistics), but the majority of it, or at least the harder, more interesting functionality, will involve fairly bespoke algorithms that are specific to the domain and can only partly dispatch through a numpy API, via something like unumpy. What we will need is a way to define how these algorithms work based on the underlying array type, and we know that the implementations will be quite different when using Dask, CuPy, or in-memory arrays. I imagine we will need our own DaskBackend, CuPyBackend, etc. implementations, and though I noticed several warnings from you guys about not building backends that depend on other backends, this seems like an exception. In other words, our GeneticsDaskBackend would need to force use of the unumpy DaskBackend. Did you guys envision this working differently, or am I on the right track?

I was also wondering if you knew of projects specific to some scientific domain that build on uarray/unumpy to support dispatching. I suppose it's early for that, but I wanted to ask because it would be great to have a model to follow if one exists, rather than working through some prototypes myself.

Thanks!
