Releases: rapidsai/dask-cuda
v23.02.01
🚨 Breaking Changes
- Pin `dask` and `distributed` for release (#1106) @galipremsagar
🐛 Bug Fixes
- Pin to pynvml < 11.5 (#1123) @wence-
- pre-commit: Update isort version to 5.12.0 (#1098) @wence-
- explicit-comms: don't mix `-` and `_` in config (#1096) @madsbk
- Update `cudf.Buffer` pointer access method (#1094) @pentschev
- Update tests for Python 3.10 (#1086) @pentschev
- Use `pkgutil.iter_modules` to get un-imported module for `test_pre_import` (#1085) @charlesbluca
- Make proxy tests with `LocalCUDACluster` asynchronous (#1084) @pentschev
- Ensure consistent results from `safe_sizeof()` in test (#1071) @madsbk
- Pass missing argument to groupby benchmark compute (#1069) @mattf
- Reorder channel priority. (#1067) @bdice
- Fix owner check when the owner is a cupy array (#1061) @wence-
🛠️ Improvements
- Pin `dask` and `distributed` for release (#1106) @galipremsagar
- Update shared workflow branches (#1105) @ajschmidt8
- Proxify: make duplicate check optional (#1101) @madsbk
- Fix whitespace & add URLs in `pyproject.toml` (#1092) @jakirkham
- pre-commit: spell, whitespace, and mypy check (#1091) @madsbk
- shuffle: use cuDF's `partition_by_hash()` when available (#1090) @madsbk
- add initial docs build (#1089) @AjayThorve
- Remove `--get-cluster-configuration` option, check for scheduler in `dask cuda config` (#1088) @charlesbluca
- Add timeout to `pytest` command (#1082) @ajschmidt8
- shuffle-benchmark: add `--partition-distribution` (#1081) @madsbk
- Ensure tests run for Python `3.10` (#1080) @ajschmidt8
- Use TrackingResourceAdaptor to get better debug info (#1079) @madsbk
- Improve shuffle-benchmark (#1074) @madsbk
- Update builds for CUDA `11.8` and Python `310` (#1072) @ajschmidt8
- Shuffle by partition to reduce memory usage significantly (#1068) @madsbk
- Enable copy_prs. (#1063) @bdice
- Add GitHub Actions Workflows (#1062) @bdice
- Unpin `dask` and `distributed` for development (#1060) @galipremsagar
- Switch to the new dask CLI (#981) @jacobtomlinson
v23.02.00
🚨 Breaking Changes
- Pin `dask` and `distributed` for release (#1106) @galipremsagar
🐛 Bug Fixes
- pre-commit: Update isort version to 5.12.0 (#1098) @wence-
- explicit-comms: don't mix `-` and `_` in config (#1096) @madsbk
- Update `cudf.Buffer` pointer access method (#1094) @pentschev
- Update tests for Python 3.10 (#1086) @pentschev
- Use `pkgutil.iter_modules` to get un-imported module for `test_pre_import` (#1085) @charlesbluca
- Make proxy tests with `LocalCUDACluster` asynchronous (#1084) @pentschev
- Ensure consistent results from `safe_sizeof()` in test (#1071) @madsbk
- Pass missing argument to groupby benchmark compute (#1069) @mattf
- Reorder channel priority. (#1067) @bdice
- Fix owner check when the owner is a cupy array (#1061) @wence-
🛠️ Improvements
- Pin `dask` and `distributed` for release (#1106) @galipremsagar
- Update shared workflow branches (#1105) @ajschmidt8
- Proxify: make duplicate check optional (#1101) @madsbk
- Fix whitespace & add URLs in `pyproject.toml` (#1092) @jakirkham
- pre-commit: spell, whitespace, and mypy check (#1091) @madsbk
- shuffle: use cuDF's `partition_by_hash()` when available (#1090) @madsbk
- add initial docs build (#1089) @AjayThorve
- Remove `--get-cluster-configuration` option, check for scheduler in `dask cuda config` (#1088) @charlesbluca
- Add timeout to `pytest` command (#1082) @ajschmidt8
- shuffle-benchmark: add `--partition-distribution` (#1081) @madsbk
- Ensure tests run for Python `3.10` (#1080) @ajschmidt8
- Use TrackingResourceAdaptor to get better debug info (#1079) @madsbk
- Improve shuffle-benchmark (#1074) @madsbk
- Update builds for CUDA `11.8` and Python `310` (#1072) @ajschmidt8
- Shuffle by partition to reduce memory usage significantly (#1068) @madsbk
- Enable copy_prs. (#1063) @bdice
- Add GitHub Actions Workflows (#1062) @bdice
- Unpin `dask` and `distributed` for development (#1060) @galipremsagar
- Switch to the new dask CLI (#981) @jacobtomlinson
v22.12.00
🚨 Breaking Changes
🐛 Bug Fixes
- Fix `parse_memory_limit` function call (#1055) @galipremsagar
- Work around Jupyter errors in CI (#1041) @pentschev
- Fix version constraint (#1036) @wence-
- Support the new `Buffer` in cudf (#1033) @madsbk
- Install Dask nightly last in CI (#1029) @pentschev
- Fix recorded time in merge benchmark (#1028) @wence-
- Switch pre-import not found test to sync definition (#1026) @pentschev
- Make local_directory a required argument for spilling impls (#1023) @wence-
- Fixes for handling MIG devices (#950) @pentschev
📖 Documentation
- Merge 22.10 into 22.12 (#1016) @pentschev
- Merge 22.08 into 22.10 (#1010) @pentschev
🚀 New Features
- Allow specifying fractions as RMM pool initial/maximum size (#1021) @pentschev
- Add feature to get cluster configuration (#1006) @quasiben
- Add benchmark option to use dask-noop (#994) @wence-
🛠️ Improvements
- Ensure linting checks for whole repo in CI (#1053) @pentschev
- Pin `dask` and `distributed` for release (#1046) @galipremsagar
- Remove `pytest-asyncio` dependency (#1045) @pentschev
- Migrate as much as possible to `pyproject.toml` (#1035) @jakirkham
- Re-implement shuffle using staging (#1030) @madsbk
- Explicit-comms-shuffle: fine control of task scheduling (#1025) @madsbk
- Remove stale labeler (#1024) @raydouglass
- Unpin `dask` and `distributed` for development (#1005) @galipremsagar
- Support cuDF's built-in spilling (#984) @madsbk
v22.10.00
🐛 Bug Fixes
- Revert "Update rearrange_by_column patch for explicit comms" (#1001) @rjzamora
- Address CI failures caused by upstream distributed and cupy changes (#993) @rjzamora
- `DeviceSerialized.__reduce_ex__`: convert frame to numpy arrays (#977) @madsbk
📖 Documentation
🚀 New Features
🛠️ Improvements
- Pin `dask` and `distributed` for release (#1003) @galipremsagar
- Update rearrange_by_column patch for explicit comms (#992) @rjzamora
- benchmarks: Add option to suppress output of point to point data (#985) @wence-
- Unpin `dask` and `distributed` for development (#971) @galipremsagar
v22.08.00
🚨 Breaking Changes
🐛 Bug Fixes
- Fix `distributed` error related to `loop_in_thread` (#963) @galipremsagar
- Add `__rmatmul__` to `ProxyObject` (#960) @jakirkham
- Always use versioneer command classes in setup.py (#948) @wence-
- Do not dispatch removed `cudf.Frame._index` object (#947) @pentschev
- Fix useless property (#944) @wence-
- LocalCUDACluster's memory limit: `None` means no limit (#943) @madsbk
- ProxyManager: support `memory_limit=None` (#941) @madsbk
- Remove deprecated `loop` kwarg to `Nanny` in `CUDAWorker` (#934) @pentschev
- Import `cleanup` fixture in `test_dask_cuda_worker.py` (#924) @pentschev
📖 Documentation
- Switch docs to use common `js` & `css` code (#967) @galipremsagar
- Switch `language` from `None` to `"en"` in docs build (#939) @galipremsagar
🚀 New Features
- Add communications bandwidth to benchmarks (#938) @pentschev
🛠️ Improvements
- Pin `dask` & `distributed` for release (#965) @galipremsagar
- Test memory_limit=None for CUDAWorker (#946) @wence-
- benchmarks: Record total number of workers in dataframe (#945) @wence-
- Benchmark refactoring: tidy data and multi-node capability via `--scheduler-file` (#940) @wence-
- Add util functions to simplify printing benchmarks results (#937) @pentschev
- Add --multiprocessing-method option to benchmarks (#933) @wence-
- Remove click pinning (#932) @charlesbluca
- Remove compiler variables (#929) @ajschmidt8
- Unpin `dask` & `distributed` for development (#927) @galipremsagar
v22.06.00
🚨 Breaking Changes
- Upgrade `numba` pinning to be in-line with rest of rapids (#912) @galipremsagar
🐛 Bug Fixes
- Reduce `test_cudf_cluster_device_spill` test and speed it up (#918) @pentschev
- Update ImportError tests with --pre-import (#914) @pentschev
- Add xfail mark to `test_pre_import_not_found` (#908) @pentschev
- Increase spill tests timeout to 30 seconds (#901) @pentschev
- Fix errors related with `distributed.worker.memory.terminate` (#900) @pentschev
- Skip tests on import error for some optional packages (#899) @pentschev
- Update auto host_memory computation when threads per worker > 1 (#896) @ayushdg
- Update black to 22.3.0 (#889) @charlesbluca
- Remove legacy `check_python_3` (#886) @pentschev
📖 Documentation
- Add documentation for `RAPIDS_NO_INITIALIZE` (#898) @charlesbluca
- Use upstream warning functions for CUDA initialization (#894) @charlesbluca
🛠️ Improvements
- Pin `dask` and `distributed` for release (#922) @galipremsagar
- Pin `dask` & `distributed` for release (#916) @galipremsagar
- Upgrade `numba` pinning to be in-line with rest of rapids (#912) @galipremsagar
- Removing test of `cudf.merge_sorted()` (#905) @madsbk
- Disable `include-ignored` coverage warnings (#903) @pentschev
- Fix ci/local script (#902) @Ethyling
- Use conda to build python packages during GPU tests (#897) @Ethyling
- Pull `requirements.txt` into Conda recipe (#893) @jakirkham
- Unpin `dask` & `distributed` for development (#892) @galipremsagar
- Build packages using mambabuild (#846) @Ethyling
v22.04.00
🚨 Breaking Changes
🐛 Bug Fixes
- Resolve build issues / consistency with conda-forge packages (#883) @charlesbluca
- Increase test_worker_force_spill_to_disk timeout (#857) @pentschev
📖 Documentation
- Remove description from non-existing `--nprocs` CLI argument (#852) @pentschev
🚀 New Features
- Add --pre-import/pre_import argument (#854) @pentschev
- Remove support for UCX < 1.11.1 (#830) @pentschev
🛠️ Improvements
- Raise `ImportError` when platform is not Linux (#885) @pentschev
- Temporarily disable new `ops-bot` functionality (#880) @ajschmidt8
- Pin `dask` & `distributed` (#878) @galipremsagar
- Upgrade min `dask` & `distributed` versions (#872) @galipremsagar
- Add `.github/ops-bot.yaml` config file (#871) @ajschmidt8
- Make Dask CUDA work with the new WorkerMemoryManager abstraction (#870) @shwina
- Implement ProxifyHostFile.evict() (#862) @madsbk
- Introduce incompatible-types and enables spilling of CuPy arrays (#856) @madsbk
- Spill to disk clean up (#853) @madsbk
- ProxyObject to support matrix multiplication (#849) @madsbk
- Unpin max dask and distributed (#847) @galipremsagar
- test_gds: skip if GDS is not available (#845) @madsbk
- ProxyObject implement `__array_function__` (#843) @madsbk
- Add option to track RMM allocations (#842) @shwina
v22.02.00
🐛 Bug Fixes
- Ignore `DeprecationWarning` from `distutils.Version` classes (#823) @pentschev
- Handle explicitly disabled UCX transports (#820) @pentschev
- Fix regex pattern to match to in test_on_demand_debug_info (#819) @pentschev
- Fix skipping GDS test if cucim is not installed (#813) @pentschev
- Unpin Dask and Distributed versions (#810) @pentschev
- Update to UCX-Py 0.24 (#805) @pentschev
📖 Documentation
- Fix Dask-CUDA version to 22.02 (#835) @jakirkham
- Merge branch-21.12 into branch-22.02 (#829) @pentschev
- Clarify `LocalCUDACluster`'s `n_workers` docstrings (#812) @pentschev
🚀 New Features
- Pin `dask` & `distributed` versions (#832) @galipremsagar
- Expose rmm-maximum_pool_size argument (#827) @VibhuJawa
- Simplify UCX configs, permitting UCX_TLS=all (#792) @pentschev
🛠️ Improvements
- Add avg and std calculation for time and throughput (#828) @quasiben
- sizeof test: increase tolerance (#825) @madsbk
- Query UCX-Py from gpuCI versioning service (#818) @pentschev
- Standardize Distributed config separator in get_ucx_config (#806) @pentschev
- Fixed `ProxyObject.__del__` to use the new Disk IO API from #791 (#802) @madsbk
- GPUDirect Storage (GDS) support for spilling (#793) @madsbk
- Disk IO interface (#791) @madsbk
v21.12.00
🐛 Bug Fixes
- Remove automatic `doc` labeler (#807) @pentschev
- Add create_cuda_context UCX config from Distributed (#801) @pentschev
- Ignore deprecation warnings from pkg_resources (#784) @pentschev
- Fix parsing of device by UUID (#780) @pentschev
- Avoid creating CUDA context in LocalCUDACluster parent process (#765) @pentschev
- Remove gen_cluster spill tests (#758) @pentschev
- Update memory_pause_fraction in test_spill (#757) @pentschev
📖 Documentation
- Add troubleshooting page with PCI Bus ID issue description (#777) @pentschev
🚀 New Features
- Handle UCX-Py FutureWarning on UCX < 1.11.1 deprecation (#799) @pentschev
- Pin max `dask` & `distributed` versions (#794) @galipremsagar
- Update to UCX-Py 0.23 (#752) @pentschev
🛠️ Improvements
- Fix spill-to-disk triggered by Dask explicitly (#800) @madsbk
- Fix Changelog Merge Conflicts for `branch-21.12` (#797) @ajschmidt8
- Use unittest.mock.patch for all os.environ tests (#787) @pentschev
- Logging when RMM allocation fails (#782) @madsbk
- Tally IDs instead of device buffers directly (#779) @madsbk
- Avoid proxy object aliasing (#775) @madsbk
- Test of sizeof proxy object (#774) @madsbk
- gc.collect when spilling on demand (#771) @madsbk
- Reenable explicit comms tests (#770) @madsbk
- Simplify JIT-unspill and writing docs (#768) @madsbk
- Increase CUDAWorker close timeout (#764) @pentschev
- Ignore known but expected test warnings (#759) @pentschev
- Spilling on demand (#756) @madsbk
- Revert "Temporarily skipping some tests because of a bug in Dask (#753)" (#754) @madsbk
- Temporarily skipping some tests because of a bug in Dask (#753) @madsbk
- Removing the `FrameProxyObject` workaround (#751) @madsbk
- Use cuDF Frame instead of Table (#748) @madsbk
- Remove proxy object locks (#747) @madsbk
- Unpin `dask` & `distributed` in CI (#742) @galipremsagar
- Update SSHCluster usage in benchmarks with new CUDAWorker (#326) @pentschev
v21.10.00
🐛 Bug Fixes
- Drop test setting UCX global options via Dask config (#738) @pentschev
- Prevent CUDA context errors when testing on single-GPU (#737) @pentschev
- Handle `ucp` import error during `initialize()` (#729) @pentschev
- Check if CUDA context was created in distributed.comm.ucx (#722) @pentschev
- Fix registering correct dispatches for `cudf.Index` (#718) @galipremsagar
- Register `percentile_lookup` for `FrameProxyObject` (#716) @galipremsagar
- Leave interface unset when ucx_net_devices unset in LocalCUDACluster (#711) @pentschev
- Update to UCX-Py 0.22 (#710) @pentschev
- Missing fixes to Distributed config namespace refactoring (#703) @pentschev
- Reset UCX-Py after rdmacm tests run (#702) @pentschev
- Skip DGX InfiniBand tests when "rc" transport is unavailable (#701) @pentschev
- Update UCX config namespace (#695) @pentschev
- Bump isort hook version (#682) @charlesbluca
📖 Documentation
- Update more docs for UCX 1.11+ (#720) @pentschev
- Forward-merge branch-21.08 to branch-21.10 (#707) @jakirkham
🚀 New Features
- Warn if CUDA context is created on incorrect device with `LocalCUDACluster` (#719) @pentschev
- Add `--benchmark-json` option to all benchmarks (#700) @charlesbluca
- Remove Distributed tests from CI (#699) @pentschev
- Add device memory limit argument to benchmarks (#683) @charlesbluca
- Support for LocalCUDACluster with MIG (#674) @akaanirban
🛠️ Improvements
- Pin max `dask` and `distributed` versions to `2021.09.1` (#735) @galipremsagar
- Implements a ProxyManagerDummy for convenience (#733) @madsbk
- Add `__array_ufunc__` support for `ProxyObject` (#731) @galipremsagar
- Use `has_cuda_context` from Distributed (#723) @pentschev
- Fix deadlock and simplify proxy tracking (#712) @madsbk
- JIT-unspill: support spilling to/from disk (#708) @madsbk
- Tests: replacing the obsolete cudf.testing._utils.assert_eq calls (#706) @madsbk
- JIT-unspill: warn when spill to disk triggers (#705) @madsbk
- Remove max version pin for `dask` & `distributed` on development branch (#693) @galipremsagar
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#675) @dillon-cullinan