Releases: helmholtz-analytics/heat
Heat 1.5 Release: distributed matrix factorization and more
Heat 1.5 Release Notes
- Overview
- Highlights
- Performance Improvements
- Sparse
- Signal Processing
- RNG
- Statistics
- Manipulations
- I/O
- Machine Learning
- Deep Learning
- Other Updates
- Contributors
Overview
With Heat 1.5, we release the first set of features developed within the ESAPCA project funded by the European Space Agency (ESA).
The main focus of this release is on distributed linear algebra operations, such as the tall-skinny SVD, batch matrix multiplication, and a triangular solver. We also introduce vectorization via `vmap` across MPI processes, and batch-parallel random number generation as the default for distributed operations.
This release also includes a new class for distributed Compressed Sparse Column matrices, paving the way for a future implementation of distributed sparse matrix multiplication.
On the performance side, our new array redistribution via MPI Custom Datatypes provides significant speed-up in operations that require it, such as FFTs (see Dalcin et al., 2018).
We are grateful to our community of users, students, open-source contributors, the European Space Agency and the Helmholtz Association for their support and feedback.
Highlights
- [ESAPCA] Distributed tall-skinny SVD: `ht.linalg.svd` (by @mrfh92)
- Distributed batch matrix multiplication: `ht.linalg.matmul` (by @FOsterfeld)
- Distributed solver for triangular systems: `ht.linalg.solve_triangular` (by @FOsterfeld)
- Vectorization across MPI processes: `ht.vmap` (by @mrfh92)
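The snippets below sketch typical calls to these new routines. Shapes, keyword arguments, and return conventions are illustrative assumptions, not the definitive API; check the `linalg` reference for details.

```python
import heat as ht

# Tall-skinny input: many more rows than columns, split along the rows.
A = ht.random.randn(10000, 64, split=0)
U, S, V = ht.linalg.svd(A)  # distributed factors (return convention assumed)

# Batched matrix multiplication: the leading dimension acts as a batch axis.
x = ht.random.randn(8, 64, 32, split=0)
y = ht.random.randn(8, 32, 16, split=0)
z = ht.linalg.matmul(x, y)  # expected shape: (8, 64, 16)

# Triangular solve: A_tri is triangular, solve A_tri @ sol = b
# (an upper/lower keyword may apply; sketch only).
A_tri = ht.array([[2.0, 1.0], [0.0, 3.0]])
b = ht.ones((2, 1))
sol = ht.linalg.solve_triangular(A_tri, b)

# Vectorization across MPI processes: ht.vmap wraps a torch-level callable,
# analogous to torch.vmap (usage assumed).
row_norms = ht.vmap(lambda t: t.norm())(A)
```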
Other Changes
Performance Improvements
- #1493 Redistribution speed-up via MPI Custom Datatypes, available by default in `ht.resplit` (by @JuanPedroGHM)
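`ht.resplit` changes the split axis of an existing DNDarray; with #1493, the underlying redistribution uses MPI custom datatypes by default. A minimal sketch:

```python
import heat as ht

x = ht.zeros((1000, 1000), split=0)  # distributed along rows
y = ht.resplit(x, axis=1)            # redistributed along columns
```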
Sparse
- #1377 New class: Distributed Compressed Sparse Column Matrix `ht.sparse.DCSC_matrix()` (by @Mystic-Slice)
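A minimal construction sketch; the factory name `ht.sparse.sparse_csc_matrix` is an assumption by analogy with the existing `ht.sparse.sparse_csr_matrix`:

```python
import torch
import heat as ht

# Build a torch CSC tensor, then distribute it column-wise.
t = torch.tensor([[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]]).to_sparse_csc()
m = ht.sparse.sparse_csc_matrix(t, split=1)  # factory name assumed
```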
Signal Processing
- #1515 Support batch 1-D convolution in `ht.signal.convolve` (by @ClaudiaComito)
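A sketch of the batch mode, assuming one 1-D signal per row with the batch dimension split across processes:

```python
import heat as ht

signals = ht.random.randn(64, 1024, split=0)  # batch of 1-D signals
kernel = ht.ones(5, dtype=ht.float32)
out = ht.signal.convolve(signals, kernel, mode="same")
```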
RNG
Statistics
- #1420 Support sketched percentile/median for large datasets with `ht.percentile(sketched=True)` (and `ht.median`) (by @mrfh92)
- #1510 Support multiple axes for distributed `ht.percentile` and `ht.median` (by @ClaudiaComito)
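Both additions in one sketch (argument order follows NumPy's `percentile`; shapes are illustrative):

```python
import heat as ht

x = ht.random.randn(1000, 100, 100, split=0)

m = ht.median(x, axis=(1, 2))              # reduce over a tuple of axes
p99 = ht.percentile(x, 99, sketched=True)  # memory-saving approximation
```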
Manipulations
- #1419 Implement distributed `unfold` operation (by @FOsterfeld)
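A sketch of `unfold`, with the argument order assumed to mirror `torch.Tensor.unfold(dimension, size, step)`:

```python
import heat as ht

x = ht.arange(10, split=0)
windows = ht.unfold(x, 0, 4, 2)  # 4 sliding windows of length 4, step 2
```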
I/O
- #1602 Improve load balancing when loading .npy files from path (by @Reisii)
- #1551 Improve load balancing when loading .csv files from path (by @Reisii)
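Typical calls to the affected loaders; function names and keywords are as exposed in recent Heat releases, and the paths are placeholders:

```python
import heat as ht

x = ht.load_npy_from_path("data/chunks", split=0)   # directory of .npy files
y = ht.load_csv("data/table.csv", sep=",", split=0)
```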
Machine Learning
- #1593 Improved batch-parallel clustering: `ht.cluster.BatchParallelKMeans` and `ht.cluster.BatchParallelKMedians` (by @mrfh92)
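Both estimators keep a scikit-learn-style `fit`/`predict` interface; a minimal sketch with illustrative hyperparameters:

```python
import heat as ht

x = ht.random.randn(10000, 8, split=0)
kmeans = ht.cluster.BatchParallelKMeans(n_clusters=4)
kmeans.fit(x)
labels = kmeans.predict(x)
```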
Deep Learning
Other Updates
- #1618 Support mpi4py 4.x.x (by @JuanPedroGHM)
Contributors
@mrfh92, @FOsterfeld, @JuanPedroGHM, @Mystic-Slice, @ClaudiaComito, @Reisii, @mtar and @krajsek
Heat 1.5.0-rc1: Pre-Release
Changes
Cluster
Data
IO
- #1602 Improved load balancing when loading .npy files from path. (by @Reisii)
- #1551 Improved load balancing when loading .csv files from path. (by @Reisii)
Linear Algebra
- #1261 Batched matrix multiplication. (by @FOsterfeld)
- #1504 Add solver for triangular systems. (by @FOsterfeld)
Manipulations
- #1419 Implement distributed `unfold` operation. (by @FOsterfeld)
Random
Signal
- #1515 Support batch 1-D convolution in `ht.signal.convolve`. (by @ClaudiaComito)
Statistics
- #1510 Support multiple axes for `ht.percentile`. (by @ClaudiaComito)
Sparse
- #1377 Distributed Compressed Sparse Column Matrix. (by @Mystic-Slice)
Other
- #1618 Support mpi4py 4.x.x (by @JuanPedroGHM)
Contributors
@ClaudiaComito, @FOsterfeld, @JuanPedroGHM, @Reisii, @mrfh92, @mtar and @krajsek
Heat 1.4.2 - Maintenance Release
Changes
Interoperability
- #1467, #1525 Support PyTorch 2.3.1 (by @mtar)
- #1535 Address test failures after netCDF4 1.7.1, numpy 2 releases (by @ClaudiaComito)
Contributors
@ClaudiaComito, @mrfh92 and @mtar
Heat 1.4.1: Bug fix release
Changes
Bug fixes
- #1472 DNDarrays returned by `_like` functions default to the same device as the input DNDarray (by @mrfh92, @ClaudiaComito)
Maintenance
Contributors
Interactive HPC tutorials, distributed FFT, batch-parallel clustering, support PyTorch 2.2.2
Changes
Documentation
- #1406 New tutorials for interactive parallel mode for both HPC and local usage (by @ClaudiaComito)
🔥 Features
- #1288 Batch-parallel K-means and K-medians (by @mrfh92)
- #1228 Introduce in-place operators for `arithmetics.py` (by @LScheib)
- #1218 Distributed Fast Fourier Transforms (by @ClaudiaComito)
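Two of these features in a short sketch (shapes illustrative; see the `ht.fft` module docs for the full interface):

```python
import heat as ht

# Distributed FFT along a given axis.
x = ht.random.randn(512, 512, split=0)
X = ht.fft.fft(x, axis=1)

# In-place arithmetic: updates x without allocating a new DNDarray.
x += 1.0
```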
Bug fixes
- #1363 `ht.array` constructor respects implicit torch device when copy is set to false (by @JuanPedroGHM)
- #1216 Avoid unnecessary gathering of distributed operand (by @samadpls)
- #1329 Refactoring of QR: stabilized Gram-Schmidt for split=1 and TS-QR for split=0 (by @mrfh92)
Interoperability
- #1418 and #1290: Support PyTorch 2.2.2 (by @mtar)
- #1315 and #1337: Fix some NumPy deprecations in the core and statistics tests (by @FOsterfeld)
Contributors
@ClaudiaComito, @FOsterfeld, @JuanPedroGHM, @LScheib, @mrfh92, @mtar, @samadpls
Bug fixes, Docker documentation update
Bug fixes
- #1259 Bug fix for `ht.regression.Lasso()` on GPU (by @mrfh92)
- #1201 Fix `ht.diff` for 1-element-axis edge case (by @mtar)
Changes
Interoperability
- #1257 Docker release 1.3.x update (by @JuanPedroGHM)
Maintenance
- #1274 Update version before release (by @ClaudiaComito)
- #1267 Unit tests: Increase tolerance for `ht.allclose` on `ht.inv` operations for all torch versions (by @ClaudiaComito)
- #1266 Sync `pre-commit` configuration with `main` branch (by @ClaudiaComito)
- #1264 Fix PyTorch release tracking workflows (by @mtar)
- #1234 Update sphinx package requirements (by @mtar)
- #1187 Create configuration file for Read the Docs (by @mtar)
Contributors
@ClaudiaComito, @JuanPedroGHM, @bhagemeier, @mrfh92 and @mtar
Scalable SVD, GSoC '22 contributions, Docker image, PyTorch 2 support, AMD GPU acceleration
This release includes many important updates (see below). We would particularly like to thank our enthusiastic GSoC 2022 / prospective GSoC 2023 contributors @Mystic-Slice @neosunhan @Sai-Suraj-27 @shahpratham @AsRaNi1 @Ishaan-Chandak 🙏🏼 Thank you so much!
Highlights
- #1155 Support PyTorch 2.0.1 (by @ClaudiaComito)
- #1152 Support AMD GPUs (by @mtar)
- #1126 Distributed hierarchical SVD (by @mrfh92)
- #1028 Introducing the `sparse` module: Distributed Compressed Sparse Row Matrix (by @Mystic-Slice)
- Performance improvements:
  - #1125 Distributed `heat.reshape()` speed-up (by @ClaudiaComito)
  - #1141 `heat.pow()` speed-up when exponent is `int` (by @ClaudiaComito, @coquelin77)
  - #1119 `heat.array()` default to `copy=None` (i.e., copy only if necessary) (by @ClaudiaComito, @neosunhan)
- #970 Dockerfile and accompanying documentation (by @bhagemeier)
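A sketch of the two headline features; `hsvd_rank` is assumed to be the rank-truncated entry point of the hierarchical SVD, and its return value is left unpacked here (check the `linalg` reference for the exact convention):

```python
import torch
import heat as ht

# Hierarchical SVD with a prescribed maximum rank.
A = ht.random.randn(2000, 200, split=0)
factors = ht.linalg.hsvd_rank(A, maxrank=10)

# Distributed CSR matrix from a torch sparse tensor.
t = torch.eye(4).to_sparse_csr()
m = ht.sparse.sparse_csr_matrix(t, split=0)
```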
Changelog
Array-API compliance / Interoperability
- #1154 Introduce `DNDarray.__array__()` method for interoperability with `numpy`, `xarray` (by @ClaudiaComito)
- #1147 Adopt NEP 29, drop support for PyTorch 1.7, Python 3.6 (by @mtar)
- #1119 `ht.array()` default to `copy=None` (i.e., copy only if necessary) (by @ClaudiaComito)
- #1020 Implement `broadcast_arrays`, `broadcast_to` (by @neosunhan)
- #1008 API: Rename `keepdim` kwarg to `keepdims` (by @neosunhan)
- #788 Interface for DPPY interoperability (by @coquelin77, @fschlimb)
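The new `__array__` method lets NumPy consume a DNDarray directly; a sketch with a non-split array, where every process holds the full data:

```python
import numpy as np
import heat as ht

x = ht.arange(6)   # no split: every process holds all elements
a = np.asarray(x)  # triggers DNDarray.__array__()
```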
New Features
- #1126 Distributed hierarchical SVD (by @mrfh92)
- #1020 Implement `broadcast_arrays`, `broadcast_to` (by @neosunhan)
- #983 Signal processing: fully distributed 1-D convolution (by @shahpratham)
- #1063 Add `__eq__` to `Device` (by @mtar)
Bug Fixes
- #1141 `heat.pow()` speed-up when exponent is `int` (by @ClaudiaComito)
- #1136 Fixed PyTorch version check in `sparse` module (by @Mystic-Slice)
- #1098 Validates number of dimensions in input to `ht.sparse.sparse_csr_matrix` (by @Ishaan-Chandak)
- #1095 Convolve with distributed kernel on multiple GPUs (by @shahpratham)
- #1094 Fix division precision error in `random` module (by @Mystic-Slice)
- #1075 Fixed initialization of DNDarrays communicator in some routines (by @AsRaNi1)
- #1066 Verify input object type and layout + supporting tests (by @Mystic-Slice)
- #1037 Distributed weighted `average()` along tuple of axes: shape of `weights` to match shape of input (by @Mystic-Slice)
Benchmarking
- #1137 Continuous benchmarking of runtime (by @JuanPedroGHM)
Documentation
- #1150 Refactoring for efficiency and readability (by @Sai-Suraj-27)
- #1130 Reintroduce Quick Start (by @ClaudiaComito)
- #1079 A better README file (by @Sai-Suraj-27)
Linear Algebra
Contributors
@AsRaNi1, @ClaudiaComito, @Ishaan-Chandak, @JuanPedroGHM, @Mystic-Slice, @Sai-Suraj-27, @bhagemeier, @coquelin77, @mrfh92, @mtar, @neosunhan, @shahpratham
Bug fixes, support OpenMPI>=4.1.2, support PyTorch 1.13.1
Changes
Communication
- #1058 Fix edge-case contiguity mismatch for Allgatherv (by @ClaudiaComito)
Contributors
Support PyTorch 1.13, Lanczos decomposition fix, bug fixes
Changes
- #1048 Support PyTorch 1.13.0 on branch release/1.2.x (by @github-actions)
🐛 Bug Fixes
- #1038 Lanczos decomposition `linalg.solver.lanczos`: Support double precision, complex data types (by @ClaudiaComito)
- #1034 `ht.array`, closed loophole allowing `DNDarray` construction with incompatible shapes of local arrays (by @Mystic-Slice)
Linear Algebra
- #1038 Lanczos decomposition `linalg.solver.lanczos`: Support double precision, complex data types (by @ClaudiaComito)
🧪 Testing
- #1025 Mirror repository on GitLab + CI (by @mtar)
- #1014 Fix: set CUDA RNG state on GPU tests for test_random.py (by @JuanPedroGHM)
Contributors
@ClaudiaComito, @JuanPedroGHM, @Mystic-Slice, @coquelin77, @mtar, @github-actions, @github-actions[bot]
v1.2.0: GSoC '22, introducing `signal` module, parallel I/O and more
Highlights
- We have been selected as a mentoring organization for Google Summer of Code, and we already have many new contributors (see below). Thank you!
- Heat now supports PyTorch 1.11
- Gearing up to support data-intensive signal processing: introduced the `signal` module and memory-distributed 1-D convolution with `ht.convolve()`
- Parallel I/O: you can now parallelize writing out to a CSV file with `ht.save_csv()`.
. - Introduced more flexibility in memory-distributed binary operations.
- Expanded functionality in the `linalg` and `manipulations` modules.
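A sketch of the two new entry points (paths and shapes are illustrative):

```python
import heat as ht

# Memory-distributed 1-D convolution.
a = ht.random.randn(100000, split=0)
v = ht.ones(7)
c = ht.convolve(a, v, mode="full")

# Parallel CSV output.
x = ht.random.randn(100, 4, split=0)
ht.save_csv(x, "out.csv")
```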
What's Changed
- Bug/825 setitem slice dndarrays by @coquelin77 in #826
- Features/807 roll by @mtar in #829
- implement vecdot by @mtar in #840
- Enhancement/798 logical dndarrray by @mtar in #851
- add moveaxis by @mtar in #854
- add swapaxes by @mtar in #853
- norm implementation by @mtar in #846
- Features/178 tile by @ClaudiaComito in #673
- Features/torch proxy by @ClaudiaComito in #856
- add normal, standard_normal by @mtar in #858
- add signbit by @mtar in #862
- add sign, sgn by @mtar in #827
- vdot implementation by @mtar in #842
- Bugfix/529 lasso example by @bhagemeier in #876
- fix binary_op on operands with single element by @mtar in #868
- conjugate method in DNDarray by @mtar in #885
- add cross by @mtar in #850
- Feature/337 determinant by @mtar in #877
- Features/746 print0 print toggle by @coquelin77 in #816
- Feature/338 matrix inverse by @mtar in #875
- `randint` accept ints for 'size' by @mtar in #916
- Support PyTorch 1.11.0 by @github-actions in #932
- 750 save csv v2 by @bhagemeier in #941
- added duplicate comm by @Dhruv454000 in #940
- changed documentation small fix by @Dhruv454000 in #956
- add digitize/bucketize by @mtar in #928
- Improve save_csv string formatting by @bhagemeier in #948
- Random: Replaced factories.array with DNDarray by @shahpratham in #960
- Features/30 convolve by @lucaspataro in #595
- Add `out` and `where` args for `ht.div` by @neosunhan in #945
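The new keywords follow NumPy-style ufunc semantics; a sketch with an illustrative mask:

```python
import heat as ht

a = ht.arange(1, 5, dtype=ht.float32)
b = ht.full((4,), 2.0)
out = ht.zeros(4)
ht.div(a, b, out=out, where=a > 2)  # divide only where the mask holds
```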
New Contributors
- @Dhruv454000 made their first contribution in #940
- @shahpratham made their first contribution in #960
- @neosunhan made their first contribution in #945
Full Changelog: v1.1.0...v1.2.0