Support distribution of xarray
#1031
Comments
Coordinates are not necessarily 1D arrays. For example, for curvilinear grids covering the Earth's surface, one would have to describe the positions (e.g. centers) of grid points as 2D arrays (one for
Got it, thanks @koldunovn, edited original post.
Implementation discussed during devs meeting Sept 28. Preliminary scheme:
@Markus-Goetz, it would be good if you could chime in.
Tagging @Mystic-Slice as they have shown interest in working on this 👋
Interesting discussion over at array-api about single-node parallelism, but xarray is also mentioned in the context of distributed execution. @TomNicholas, you might be interested in this effort as well.
Hi everyone! DXarray:Xarray uses There are two solutions:
I would love to know what you all think. Any suggestion is highly appreciated!
Hi everyone, thanks for tagging me here! I'm a bit unclear what you would like to achieve here - are you talking about: (a) making Some miscellaneous comments:
This is the point of the
There have been some discussions of wrapping PyTorch tensors in xarray. We also have an ongoing project to publicly expose
Hi @TomNicholas! This should allow users to manipulate huge amounts of data while also being able to work with the more intuitive Xarray API. @ClaudiaComito should be able to shed more light on this.
Would xarray's experience with Dask help in creating the data structure you want? There are also implementations with GPU support: https://xarray.dev/blog/xarray-kvikio
Meanwhile I have implemented a basic idea of
Things that might get a bit more complicated:
@ClaudiaComito, in #1154 it seems you started wrapping heat objects inside xarray, which is awesome! I recently improved the documentation on wrapping numpy-like arrays with xarray objects (pydata/xarray#7911 and pydata/xarray#7951). Those extra pages in the docs aren't released yet, but for now you can view them here (on wrapping numpy-like arrays) and here (on wrapping distributed numpy-like arrays).
The branch 1031-support-distribution-of-xarray now provides:
I will stop here until the team has discussed the "wrapping approach" proposed by TomNicholas, because such an approach would be much easier to implement (if applicable to Heat).
This issue is stale because it has been open for 60 days with no activity.
This issue was closed because it has been inactive for 60 days since being marked as stale.
See https://docs.xarray.dev/en/stable/
If I understand correctly, an xarray object is made up of the actual data array (np.ndarray), and 1-D coordinates arrays (dictionaries?) that map data dimensions and indices to meaningful physical quantities.
For example, if xarray is a matrix of coordinates (date, temperature), users will be able to perform
Feature functionality
Enable distribution of xarray object, allow named dimensions, keep track of coordinates arrays, one of which will be distributed.
Example:
Check out PyTorch's named tensors functionality.
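To make the idea above more concrete, here is a minimal, pure-Python sketch of a labeled array with named dimensions and 1-D coordinate arrays, plus a helper that splits it along its first dimension so that each piece keeps only its own slice of the matching coordinate array. This is only an illustration under stated assumptions: the names LabeledArray, sel and split_along_first_dim, the ("date", "station") layout and the sample values are all hypothetical, and none of this is xarray's or Heat's actual API. The pieces are plain Python lists standing in for per-process chunks; no real communication happens.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LabeledArray:
    """Toy labeled 2-D array (hypothetical; NOT the xarray or Heat API)."""
    data: List[List[float]]   # 2-D payload, stored row by row
    dims: Tuple[str, str]     # one name per axis of `data`
    coords: Dict[str, list]   # 1-D coordinate values for each named dimension

    def sel(self, **labels):
        """Select along one named dimension by coordinate label, not by integer index."""
        (dim, label), = labels.items()        # exactly one keyword is expected
        idx = self.coords[dim].index(label)   # translate label -> position
        if dim == self.dims[0]:
            return list(self.data[idx])       # one row
        return [row[idx] for row in self.data]  # one column


def split_along_first_dim(arr: LabeledArray, nparts: int) -> List[LabeledArray]:
    """Cut `arr` into `nparts` contiguous pieces along its first dimension.

    The coordinate array of the split dimension is cut the same way;
    coordinates of the other dimension are copied to every piece.
    """
    split_dim = arr.dims[0]
    base, extra = divmod(len(arr.data), nparts)
    parts, start = [], 0
    for rank in range(nparts):
        stop = start + base + (1 if rank < extra else 0)
        local_coords = dict(arr.coords)
        local_coords[split_dim] = arr.coords[split_dim][start:stop]
        parts.append(LabeledArray(arr.data[start:stop], arr.dims, local_coords))
        start = stop
    return parts


# Made-up temperature readings: 5 dates x 2 stations.
temps = LabeledArray(
    data=[[1.5, 2.0], [0.5, 1.0], [-1.0, 0.0], [2.5, 3.0], [4.0, 4.5]],
    dims=("date", "station"),
    coords={
        "date": ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04", "2022-01-05"],
        "station": ["A", "B"],
    },
)

# Two simulated "processes"; the first gets 3 dates, the second gets 2.
parts = split_along_first_dim(temps, 2)
```

With this layout, a label-based lookup such as parts[1].sel(date="2022-01-04") only needs the local slice of the date coordinate, which is the bookkeeping a distributed variant would have to keep consistent with the data split.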
Additional context
Initiating collaboration with N. Koldunov @koldunovn at Alfred Wegener Institute (Helmholtz centre for polar and marine research).
Also interesting for @kleinert-f, @ben-bou
Tagging @bhagemeier for help with implementation.