[RELEASE] dask-cuda v24.10 #1390

Open. Wants to merge 19 commits into base: main

Commits on Jul 19, 2024

  1. Commit 6db4b71
  2. Merge pull request #1361 from rapidsai/branch-24.08

    Forward-merge branch-24.08 into branch-24.10
    GPUtester committed Jul 19, 2024
    Commit 4852856

Commits on Jul 24, 2024

  1. Merge pull request #1365 from rapidsai/branch-24.08

    Forward-merge branch-24.08 into branch-24.10
    GPUtester committed Jul 24, 2024
    Commit 5353018
  2. Commit c5d57e6

Commits on Jul 25, 2024

  1. Merge pull request #1368 from jameslamb/branch-24.10-merge-24.08

    Merge branch-24.08 into branch-24.10
    AyodeAwe committed Jul 25, 2024
    Commit 14efa4d

Commits on Jul 31, 2024

  1. Replace cuDF (de)serializer with cuDF spill-aware (de)serializer (#1369)

    Replace the cuDF (de)serializer with the cuDF spill-aware (de)serializer. Using both together should be avoided, as that will cause excessive spilling.
    
    Additionally add:
    
    - Missing test of cuDF internal spill mechanism with `LocalCUDACluster`;
    - A `dask cuda worker` warning to alert the user that the cuDF spilling mechanism requires the client/scheduler to enable it as well.
    
    Closes #1363.
    
    Authors:
      - Peter Andreas Entschev (https://github.com/pentschev)
    
    Approvers:
      - Mads R. B. Kristensen (https://github.com/madsbk)
    
    URL: #1369
    pentschev committed Jul 31, 2024
    Commit c0cd465
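
As a rough illustration of the setup that #1369 above tests (cuDF's internal spill mechanism with `LocalCUDACluster`), the sketch below builds the cluster arguments and starts a cluster. The `enable_cudf_spill` keyword is an assumption here: the commit message does not name the option, so check it against the dask-cuda API reference for your version. The helper names are hypothetical, and starting the cluster needs a GPU with `dask-cuda` installed.

```python
def cudf_spill_cluster_kwargs(n_workers=2):
    """Build keyword arguments for a cluster that uses cuDF's own spilling.

    `enable_cudf_spill` is assumed from the dask-cuda documentation, not
    taken from the commit message above.
    """
    return {"n_workers": n_workers, "enable_cudf_spill": True}


def start_cluster_with_cudf_spill(n_workers=2):
    """Start a LocalCUDACluster with cuDF spilling enabled (requires a GPU)."""
    # Imported lazily so this module can be loaded on machines without GPUs.
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    cluster = LocalCUDACluster(**cudf_spill_cluster_kwargs(n_workers))
    client = Client(cluster)
    return cluster, client
```

Because `LocalCUDACluster` creates the scheduler and workers from one set of arguments, a single option covers all of them in this sketch. With separately launched `dask cuda worker` processes, the warning described in the commit applies instead.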

Commits on Aug 7, 2024

  1. Merge pull request #1372 from rapidsai/branch-24.08

    Forward-merge branch-24.08 into branch-24.10
    GPUtester committed Aug 7, 2024
    Commit 8d42cf3

Commits on Aug 8, 2024

  1. Update pre-commit hooks (#1373)

    This PR updates pre-commit hooks to the latest versions that are supported without causing style check errors.
    
    Authors:
      - Kyle Edwards (https://github.com/KyleFromNVIDIA)
    
    Approvers:
      - James Lamb (https://github.com/jameslamb)
    
    URL: #1373
    KyleFromNVIDIA committed Aug 8, 2024
    Commit 00c37dc

Commits on Aug 22, 2024

  1. Drop Python 3.9 support (#1377)

    Contributes to rapidsai/build-planning#88
    
    Finishes the work of dropping Python 3.9 support.
    
    This project stopped building / testing against Python 3.9 as of rapidsai/shared-workflows#235.
    This PR updates configuration and docs to reflect that.
    
    ## Notes for Reviewers
    
    ### How I tested this
    
    Checked that there were no remaining uses like this:
    
    ```shell
    git grep -E '3\.9'
    git grep '39'
    git grep 'py39'
    ```
    
    And similar for variations on Python 3.8 (to catch things that were missed the last time this was done).
    
    Authors:
      - James Lamb (https://github.com/jameslamb)
    
    Approvers:
      - https://github.com/jakirkham
    
    URL: #1377
    jameslamb committed Aug 22, 2024
    Commit 49ebabc
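
The `git grep` check described in #1377 above can be approximated in Python when a scriptable version is handy, for example in a CI step. This is a minimal sketch, not part of the PR: the function name is hypothetical, and unlike `git grep` it walks every readable file under the directory rather than only tracked files.

```python
import re
from pathlib import Path

# Patterns mirroring the three `git grep` calls in the commit message.
PATTERNS = [re.compile(r"3\.9"), re.compile(r"39"), re.compile(r"py39")]


def find_version_references(root, patterns=PATTERNS):
    """Return (relative_path, line_number, line) for each matching line."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and anything inside the .git metadata folder.
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in patterns):
                hits.append((str(path.relative_to(root)), number, line.strip()))
    return hits
```

As with the shell commands, the bare `39` pattern is deliberately broad, so expect false positives (hashes, dates) that need a manual look.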

Commits on Aug 23, 2024

  1. Remove NumPy <2 pin (#1375)

    This PR removes the NumPy<2 pin. NumPy 2 is expected to work for
    RAPIDS projects once CuPy 13.3.0 is released (CuPy 13.2.0 had
    some issues preventing its use with NumPy 2).
    
    Authors:
      - Sebastian Berg (https://github.com/seberg)
      - https://github.com/jakirkham
    
    Approvers:
      - https://github.com/jakirkham
    
    URL: #1375
    seberg committed Aug 23, 2024
    Commit 4a02fcc

Commits on Aug 27, 2024

  1. Update rapidsai/pre-commit-hooks (#1379)

    This PR updates rapidsai/pre-commit-hooks to version 0.4.0.
    
    Authors:
      - Kyle Edwards (https://github.com/KyleFromNVIDIA)
    
    Approvers:
      - James Lamb (https://github.com/jameslamb)
    
    URL: #1379
    KyleFromNVIDIA committed Aug 27, 2024
    Commit b519e39

Commits on Aug 30, 2024

  1. [Benchmark] Add parquet read benchmark (#1371)

    Adds new benchmark for parquet read performance using a `LocalCUDACluster`. The user can pass in `--key` and `--secret` options to specify S3 credentials.
    
    E.g.
    ```
    $ python ./local_read_parquet.py --devs 0,1,2,3,4,5,6,7 --filesystem fsspec --type gpu --file-count 48 --aggregate-files
    
    Parquet read benchmark
    --------------------------------------------------------------------------------
    Path                      | s3://dask-cudf-parquet-testing/dedup_parquet
    Columns                   | None
    Backend                   | cudf
    Filesystem                | fsspec
    Blocksize                 | 244.14 MiB
    Aggregate files           | True
    Row count                 | 372066
    Size on disk              | 1.03 GiB
    Number of workers         | 8
    ================================================================================
    Wall clock                | Throughput
    --------------------------------------------------------------------------------
    36.75 s                   | 28.78 MiB/s
    21.29 s                   | 49.67 MiB/s
    17.91 s                   | 59.05 MiB/s
    ================================================================================
    Throughput                | 41.77 MiB/s +/- 7.81 MiB/s
    Bandwidth                 | 0 B/s +/- 0 B/s
    Wall clock                | 25.32 s +/- 8.20 s
    ================================================================================
    ...
    ```
    
    **Notes**:
    - S3 performance generally scales with the number of workers (multiplied by the number of threads per worker)
    - The example shown above was not executed from an EC2 instance
    - The example shown above *should* perform better after rapidsai/cudf#16657
    - Using `--filesystem arrow` together with `--type gpu` performs well, but depends on rapidsai/cudf#16684
    
    Authors:
      - Richard (Rick) Zamora (https://github.com/rjzamora)
    
    Approvers:
      - Mads R. B. Kristensen (https://github.com/madsbk)
      - Peter Andreas Entschev (https://github.com/pentschev)
    
    URL: #1371
    rjzamora committed Aug 30, 2024
    Commit 1cc4d0b
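
The summary rows in the sample output of #1371 above can be cross-checked from the three per-run rows. The sketch below is an observation about the printed numbers, not a description of how `local_read_parquet.py` computes them: it assumes the wall-clock summary is the mean with a population standard deviation, and that the throughput summary is the harmonic mean of the per-run throughputs (equivalent to total size divided by mean wall clock).

```python
from statistics import harmonic_mean, mean, pstdev

# Per-run values copied from the sample output in the commit message.
wall_clock_s = [36.75, 21.29, 17.91]
throughput_mib_s = [28.78, 49.67, 59.05]


def summarize(wall_clock, throughput):
    """Recompute summary statistics under the assumptions stated above."""
    return {
        "wall_clock_mean": mean(wall_clock),
        "wall_clock_std": pstdev(wall_clock),  # population std (divide by N)
        "throughput": harmonic_mean(throughput),
    }


summary = summarize(wall_clock_s, throughput_mib_s)
print(f"Wall clock: {summary['wall_clock_mean']:.2f} s "
      f"+/- {summary['wall_clock_std']:.2f} s")
print(f"Throughput: {summary['throughput']:.2f} MiB/s")
```

Under these assumptions the values agree with the reported `25.32 s +/- 8.20 s` and `41.77 MiB/s`. The reported throughput spread (`+/- 7.81 MiB/s`) is not reproduced here, since the output does not indicate how it is derived.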

Commits on Sep 5, 2024

  1. Add support for Python 3.12 (#1380)

    Contributes to rapidsai/build-planning#40
    
    This PR adds support for Python 3.12.
    
    ## Notes for Reviewers
    
    This is part of ongoing work to add Python 3.12 support across RAPIDS.
    It temporarily introduces a build/test matrix including Python 3.12, from rapidsai/shared-workflows#213.
    
    A follow-up PR will revert back to pointing at the `branch-24.10` branch of `shared-workflows` once all
    RAPIDS repos have added Python 3.12 support.
    
    ### This will fail until all dependencies have been updated to support Python 3.12
    
    CI here is expected to fail until all of this project's upstream dependencies support Python 3.12.
    
    This can be merged whenever all CI jobs are passing.
    
    Authors:
      - James Lamb (https://github.com/jameslamb)
    
    Approvers:
      - Bradley Dice (https://github.com/bdice)
    
    URL: #1380
    jameslamb committed Sep 5, 2024
    Commit 5d9a4cc

Commits on Sep 9, 2024

  1. enable Python 3.12 tests on PRs (#1382)

    Follow-up to #1380.
    
    Now that both `cudf` (rapidsai/cudf#16745) and `ucxx` (rapidsai/ucxx#276) have Python 3.12 wheels available, it should be possible to test `dask-cuda` against Python 3.12 in CI.
    
    This proposes that.
    
    Authors:
      - James Lamb (https://github.com/jameslamb)
    
    Approvers:
      - Bradley Dice (https://github.com/bdice)
    
    URL: #1382
    jameslamb committed Sep 9, 2024
    Commit 72d51e9

Commits on Sep 11, 2024

  1. Add notes on cudf spilling to docs (#1383)

    Updates the dask-cuda documentation to include notes on native cuDF spilling, since it is often the best spilling approach for ETL with Dask-CUDA (please feel free to correct me if I'm wrong).
    
    Authors:
      - Richard (Rick) Zamora (https://github.com/rjzamora)
    
    Approvers:
      - Peter Andreas Entschev (https://github.com/pentschev)
    
    URL: #1383
    rjzamora committed Sep 11, 2024
    Configuration menu
    Copy the full SHA
    dc168d7 View commit details
    Browse the repository at this point in the history

Commits on Sep 16, 2024

  1. Fix typo in spilling documentation (#1384)

    Small follow-up to #1383
    
    - Fixes a typo in a link that references the "Spilling from device" page
    - Small tweaks to the spilling discussion on the "best practices" page
    
    Authors:
      - Richard (Rick) Zamora (https://github.com/rjzamora)
    
    Approvers:
      - Peter Andreas Entschev (https://github.com/pentschev)
    
    URL: #1384
    rjzamora committed Sep 16, 2024
    Configuration menu
    Copy the full SHA
    d5b70f0 View commit details
    Browse the repository at this point in the history

Commits on Sep 17, 2024

  1. Update to flake8 7.1.1. (#1385)

    We need to update flake8 to fix a false-positive that appears with older flake8 versions on Python 3.12.
    
    Authors:
      - Bradley Dice (https://github.com/bdice)
    
    Approvers:
      - James Lamb (https://github.com/jameslamb)
      - Benjamin Zaitlen (https://github.com/quasiben)
      - Peter Andreas Entschev (https://github.com/pentschev)
    
    URL: #1385
    bdice committed Sep 17, 2024
    Commit dbb50a5

Commits on Sep 18, 2024

  1. Commit 637c504

Commits on Sep 24, 2024

  1. Commit 1c84a6a