From 44aa2e07b765f6ec83fc48b9e62558eb1cd5584e Mon Sep 17 00:00:00 2001
From: Stanley Tsang
Date: Wed, 17 Jan 2024 10:56:44 -0700
Subject: [PATCH] 6.1 bulk update from develop branch 2024-1-16 (#328)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Develop stream 2023-10-27 (#309)

* Accumulator types changed for reduce and test_hipcub_device_reduce fixed for new thread operators

* Add thread operators test

* Bump CUB and Thrust versions to 2.1.0

* change how we use the rocprim::host_warp_size

* update changelog

* move host_warp_size_wrapper out of the HIPCUB_HOST_WARP_THREADS macro

* update changelog to be clearer

* add changes related to __int128_t support

* finish int128 support

add tests for block and device_radix_sort
add assert_bit_eq for (u)int128 vectors

* Test large indices for DeviceReduce

* Fix clang format

* Include FetchContent in new ROCmCMakeBuildToolsDependency cmake file

* Use _ENABLE_EXTENDED_ALIGNED_STORAGE for windows build in rmake.py

* Update CHANGELOG to ROCm 6.1

---------

Co-authored-by: Bence Parajdi

* StreamHPC 2023-11-21 (DeviceMemcpy::Batched) (#314)

* ci: use build instead rocm-build and nvcc-build tags

This allows the build job to be performed by any runner configured for
building, instead of the ROCm-specialized builder. As the target
architectures are specified ahead of time, the GPU is not needed during
the build process, and may be performed by any builder.

* feat: Add interface for batched memcpy from rocPRIM and CUB

* style(device_memcpy): improve formatting

---------

Co-authored-by: Robin Voetter
Co-authored-by: Gergely Mészáros

* Bump cryptography from 41.0.4 to 41.0.6 in /docs/.sphinx (#316)

Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.4 to 41.0.6.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/41.0.4...41.0.6)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update rocm-docs-core to 0.30.3 (#319)

* Update rocm-docs-core to 0.30.3

* Update link to hipCUB docs in README

* Remove doc artifacts

* Bump gitpython from 3.1.37 to 3.1.41 in /docs/.sphinx (#320)

Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.37 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.37...3.1.41)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: indirect
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* 6.0 final mergeback to develop (#321)

* Separate gfx942 specific code (#289)

Co-authored-by: Stanley Tsang

* Split rocm-cmake dependency out before hip include (#293)

* Split rocm-cmake dependency out before hip include

* Update comments

* Fix cpp-check reported issues

Fixed a number of issues that static analysis picked up:
- Made some functions const since they don't modify member state
- Made some parameters const, since they're never modified
- Fixes for several benchmark/test functions
  - Removed unused variable declarations
  - Added missing input data transfer from host to device
- Added some member variables to constructor initializer list
- Added override keyword in several places
- Fixed up item placeholders in some printf statements

* Fix cpp-check reported issues

* Removed host to data transfer from memcpy benchmark.

Since this benchmark only tests memcpy performance between device
buffers, we don't really need to copy data into these from the host.

* update googlebenchmark version (#302)

* Avoid a segmentation fault when clearing cached blocks (#297) (#310)

Co-authored-by: Tom Benson

* Include FetchContent before usage (#308)

* 6.0 cherry pick for changelog and version update (#313)

* Update documentation and version for 6.0

* Fix version

---------

Co-authored-by: Eiden Yoshida <47196116+eidenyoshida@users.noreply.github.com>
Co-authored-by: Lauren Wrubleski
Co-authored-by: Wayne Franz
Co-authored-by: Tom Benson

* Bump jinja2 from 3.1.2 to 3.1.3 in /docs/.sphinx (#322)

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.2...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Adding CODEOWNERS file (#324)

* Bump rocm-docs-core[api_reference] in /docs/.sphinx (#326)

Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.3 to 0.31.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.3...v0.31.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Standardize documentation for ReadtheDocs (#325)

* Update links in README.md

- Update the links to other ROCm repositories that are now in the ROCm org.
- Replace link to "rocm.github.io" with "rocm.docs.amd.com".

* Update package version

---------

Signed-off-by: dependabot[bot]
Co-authored-by: Beatriz Navidad Vilches <61422851+Beanavil@users.noreply.github.com>
Co-authored-by: Bence Parajdi
Co-authored-by: Nara
Co-authored-by: Robin Voetter
Co-authored-by: Gergely Mészáros
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sam Wu
Co-authored-by: Eiden Yoshida <47196116+eidenyoshida@users.noreply.github.com>
Co-authored-by: Lauren Wrubleski
Co-authored-by: Wayne Franz
Co-authored-by: Tom Benson
Co-authored-by: David Galiffi
---
 .github/CODEOWNERS                        |  1 +
 .github/dependabot.yml                    |  2 +-
 .readthedocs.yaml                         |  4 ++--
 CHANGELOG.md                              | 13 ++++++++-----
 CMakeLists.txt                            |  2 +-
 README.md                                 | 20 ++++++++++----------
 docs/.gitignore                           | 11 ++++-------
 docs/.sphinx/requirements.in              |  1 -
 docs/conf.py                              | 21 +++++++++++++++++++--
 docs/{.doxygen => doxygen}/Doxyfile       |  4 ++--
 docs/license.rst                          |  4 ++++
 docs/{.sphinx => sphinx}/_toc.yml.in      |  5 ++++-
 docs/sphinx/requirements.in               |  1 +
 docs/{.sphinx => sphinx}/requirements.txt |  6 +++---
 14 files changed, 60 insertions(+), 35 deletions(-)
 create mode 100644 .github/CODEOWNERS
 delete mode 100644 docs/.sphinx/requirements.in
 rename docs/{.doxygen => doxygen}/Doxyfile (99%)
 create mode 100644 docs/license.rst
 rename docs/{.sphinx => sphinx}/_toc.yml.in (62%)
 create mode 100644 docs/sphinx/requirements.in
 rename docs/{.sphinx => sphinx}/requirements.txt (97%)

diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
new file mode 100644
index 00000000..9b4ed4c4
--- /dev/null
+++ b/.github/CODEOWNERS
@@ -0,0 +1 @@
+* @stanleytsang-amd @umfranzw @RobsonRLemos @lawruble13
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
index 95e8b2ba..0e0a252e 100644
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -6,7 +6,7 @@
 version: 2
 updates:
   - package-ecosystem: "pip" # See documentation for possible values
-    directory: "/docs/.sphinx" # Location of package manifests
+    directory: "/docs/sphinx" # Location of package manifests
     open-pull-requests-limit: 10
     schedule:
       interval: "daily"
diff --git a/.readthedocs.yaml b/.readthedocs.yaml
index e2bf130c..da1d3ae4 100644
--- a/.readthedocs.yaml
+++ b/.readthedocs.yaml
@@ -10,10 +10,10 @@ formats: [htmlzip, pdf, epub]
 
 python:
   install:
-    - requirements: docs/.sphinx/requirements.txt
+    - requirements: docs/sphinx/requirements.txt
 
 build:
-  os: ubuntu-20.04
+  os: ubuntu-22.04
   tools:
     python: "3.8"
   apt_packages:
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 950525f2..831a335e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,7 +2,7 @@
 
 See README.md on how to build the hipCUB documentation using Doxygen.
 
-## (Unreleased) hipCUB-2.13.1 for ROCm 6.1.0
+## (Unreleased) hipCUB-3.1.0 for ROCm 6.1.0
 ### Changed
 - CUB backend references CUB and Thrust version 2.1.0.
 - Updated `HIPCUB_HOST_WARP_THREADS` macro definition to match `host_warp_size` changes from rocPRIM 3.0.
@@ -13,19 +13,22 @@ See README.md on how to build the hipCUB documentation using Doxygen.
 ### Added
 - Added interface `DeviceMemcpy::Batched` for batched memcpy from rocPRIM and CUB.
 
-## (Unreleased) hipCUB-2.13.1 for ROCm 5.7.0
+## hipCUB-3.0.0 for ROCm 6.0.0
+### Changed
+- Removed `DOWNLOAD_ROCPRIM`, forcing rocPRIM to download can be done with `DEPENDENCIES_FORCE_DOWNLOAD`.
+
+## hipCUB-2.13.2 for ROCm 5.7.0
 ### Changed
 - CUB backend references CUB and Thrust version 2.0.1.
 - Fixed `DeviceSegmentedReduce::ArgMin` and `DeviceSegmentedReduce::ArgMax` by returning the segment-relative index instead of the absolute one.
 - Fixed `DeviceSegmentedReduce::ArgMin` for inputs where the segment minimum is smaller than the value returned for empty segments. An equivalent fix is applied to `DeviceSegmentedReduce::ArgMax`.
-- Removed `DOWNLOAD_ROCPRIM`, forcing rocPRIM to download can be done with `DEPENDENCIES_FORCE_DOWNLOAD`.
 ### Known Issues
 - `debug_synchronous` no longer works on CUDA platform. `CUB_DEBUG_SYNC` should be used to enable those checks.
 - `DeviceReduce::Sum` does not compile on CUDA platform for mixed extended-floating-point/floating-point InputT and OutputT types.
 - `DeviceHistogram::HistogramEven` fails on CUDA platform for `[LevelT, SampleIteratorT] = [int, int]`.
 - `DeviceHistogram::MultiHistogramEven` fails on CUDA platform for `[LevelT, SampleIteratorT] = [int, int/unsigned short/float/double]` and `[LevelT, SampleIteratorT] = [float, double]`.
 
-## (Unreleased) hipCUB-2.13.1 for ROCm 5.5.0
+## hipCUB-2.13.1 for ROCm 5.5.0
 ### Added
 - Benchmarks for `BlockShuffle`, `BlockLoad`, and `BlockStore`.
 ### Changed
@@ -36,7 +39,7 @@ See README.md on how to build the hipCUB documentation using Doxygen.
 - `BlockRadixRankMatch` is currently broken under the rocPRIM backend.
 - `BlockRadixRankMatch` with a warp size that does not exactly divide the block size is broken under the CUB backend.
 
-## (Unreleased) hipCUB-2.13.0 for ROCm 5.4.0
+## hipCUB-2.13.0 for ROCm 5.4.0
 ### Added
 - CMake functionality to improve build parallelism of the test suite that splits compilation units by function or by parameters.
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 930a4790..a4b796c6 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -98,7 +98,7 @@ if(BUILD_ADDRESS_SANITIZER)
 endif()
 
 # Setup VERSION
-set(VERSION_STRING "2.13.1")
+set(VERSION_STRING "3.1.0")
 rocm_setup_version(VERSION ${VERSION_STRING})
 
 # Print configuration summary
diff --git a/README.md b/README.md
index a05fef21..69d75bb7 100644
--- a/README.md
+++ b/README.md
@@ -1,23 +1,23 @@
 # hipCUB
 
-hipCUB is a thin wrapper library on top of [rocPRIM](https://github.com/ROCmSoftwarePlatform/rocPRIM) or
+hipCUB is a thin wrapper library on top of [rocPRIM](https://github.com/ROCm/rocPRIM) or
 [CUB](https://github.com/thrust/cub). It enables developers to port a project using the CUB library to the
-[HIP](https://github.com/ROCm-Developer-Tools/HIP) layer to run on AMD hardware. In the [ROCm](https://rocm.github.io/)
+[HIP](https://github.com/ROCm/HIP) layer to run on AMD hardware. In the [ROCm](https://www.github.com/ROCm/ROCm/)
 environment, hipCUB uses the rocPRIM library as the backend. However, on CUDA platforms it uses CUB instead.
 
 ## Documentation
 
-Information about the library API and other user topics can be found in the [hipCUB documentation](https://hipcub.readthedocs.io/en/latest).
+Information about the library API and other user topics can be found in the [hipCUB documentation](https://rocm.docs.amd.com/projects/hipCUB/en/latest/index.html).
 
 ## Requirements
 
 * Git
 * CMake (3.16 or later)
 * For AMD GPUs:
-  * AMD [ROCm](https://rocm.github.io/install.html) platform (1.8.0 or later)
-    * Including [HIP-clang](https://github.com/ROCm-Developer-Tools/HIP/blob/master/INSTALL.md#hip-clang) compiler, which must be
+  * AMD [ROCm](https://rocm.docs.amd.com/en/latest/) platform (1.8.0 or later)
+    * Including [HIP-clang](https://github.com/ROCm/HIP/blob/master/INSTALL.md#hip-clang) compiler, which must be
       set as C++ compiler on ROCm platform.
-  * [rocPRIM](https://github.com/ROCmSoftwarePlatform/rocPRIM) library
+  * [rocPRIM](https://github.com/ROCm/rocPRIM) library
     * Automatically downloaded and built by CMake script.
     * Requires CMake 3.16.9 or later.
 * For NVIDIA GPUs:
@@ -41,7 +41,7 @@ Optional:
 ## Build And Install
 
 ```shell
-git clone https://github.com/ROCmSoftwarePlatform/hipCUB.git
+git clone https://github.com/ROCm/hipCUB.git
 
 # Go to hipCUB directory, create and go to the build directory.
 cd hipCUB; mkdir build; cd build
@@ -83,7 +83,7 @@ make package
 Initial support for HIP on Windows has been added. To install, use the provided rmake.py python script:
 
 ```shell
-git clone https://github.com/ROCmSoftwarePlatform/hipCUB.git
+git clone https://github.com/ROCm/hipCUB.git
 cd hipCUB
 
 # the -i option will install rocPRIM to C:\hipSDK by default
@@ -185,7 +185,7 @@ cd hipCUB; cd build
 cd hipCUB; cd docs
 
 # Install required pip packages
-python3 -m pip install -r .sphinx/requirements.txt
+python3 -m pip install -r sphinx/requirements.txt
 
 # Build the documentation
 python3 -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/html
@@ -197,7 +197,7 @@ python3 -m http.server
 
 ## Support
 
-Bugs and feature requests can be reported through [the issue tracker](https://github.com/ROCmSoftwarePlatform/hipCUB/issues).
+Bugs and feature requests can be reported through [the issue tracker](https://github.com/ROCm/hipCUB/issues).
 
 ## Contributions and License
 
diff --git a/docs/.gitignore b/docs/.gitignore
index a395d6cc..c349ea24 100644
--- a/docs/.gitignore
+++ b/docs/.gitignore
@@ -1,8 +1,5 @@
 /_build/
-/_images/
-/_static/
-/_templates/
-/.doxygen/docBin
-/.doxygen/hipCUB.tag
-/.sphinx/_toc.yml
-/api
+/doxygen/html
+/doxygen/xml
+/doxygen/hipCUB.tag
+/sphinx/_toc.yml
diff --git a/docs/.sphinx/requirements.in b/docs/.sphinx/requirements.in
deleted file mode 100644
index 63ee84f2..00000000
--- a/docs/.sphinx/requirements.in
+++ /dev/null
@@ -1 +0,0 @@
-rocm-docs-core[api_reference]>=0.20.0
diff --git a/docs/conf.py b/docs/conf.py
index cf4a47fa..133736cb 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -4,11 +4,28 @@
 # list see the documentation:
 # https://www.sphinx-doc.org/en/master/usage/configuration.html
 
+import re
+
 from rocm_docs import ROCmDocs
 
+with open('../CMakeLists.txt', encoding='utf-8') as f:
+    match = re.search(r'set\(VERSION_STRING\s+\"?([0-9.]+)[^0-9.]+', f.read())
+    if not match:
+        raise ValueError("VERSION not found!")
+    version_number = match[1]
+left_nav_title = f"hipCUB {version_number} Documentation"
+
+# for PDF output on Read the Docs
+project = "hipCUB Documentation"
+author = "Advanced Micro Devices, Inc."
+copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
+version = version_number
+release = version_number
+
+external_toc_path = "./sphinx/_toc.yml"
 
-docs_core = ROCmDocs("hipCUB Documentation")
-docs_core.run_doxygen()
+docs_core = ROCmDocs(left_nav_title)
+docs_core.run_doxygen(doxygen_root="doxygen", doxygen_path="doxygen/xml")
 docs_core.enable_api_reference()
 docs_core.setup()
 
diff --git a/docs/.doxygen/Doxyfile b/docs/doxygen/Doxyfile
similarity index 99%
rename from docs/.doxygen/Doxyfile
rename to docs/doxygen/Doxyfile
index 5fc712cb..6f50272d 100644
--- a/docs/.doxygen/Doxyfile
+++ b/docs/doxygen/Doxyfile
@@ -58,7 +58,7 @@ PROJECT_LOGO =
 # entered, it will be relative to the location where doxygen was started. If
 # left blank the current directory will be used.
 
-OUTPUT_DIRECTORY = docBin
+OUTPUT_DIRECTORY = .
 
 # If the CREATE_SUBDIRS tag is set to YES then doxygen will create 4096 sub-
 # directories (in 2 levels) under the output directory of each output format and
@@ -2094,7 +2094,7 @@ TAGFILES =
 # tag file that is based on the input files it reads. See section "Linking to
 # external documentation" for more information about the usage of tag files.
 
-GENERATE_TAGFILE = docBin/html/tagfile.xml
+GENERATE_TAGFILE = html/tagfile.xml
 
 # If the ALLEXTERNALS tag is set to YES, all external class will be listed in
 # the class index. If set to NO, only the inherited external classes will be
diff --git a/docs/license.rst b/docs/license.rst
new file mode 100644
index 00000000..60fbe859
--- /dev/null
+++ b/docs/license.rst
@@ -0,0 +1,4 @@
+License
+=======
+
+.. include:: ../LICENSE.txt
diff --git a/docs/.sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in
similarity index 62%
rename from docs/.sphinx/_toc.yml.in
rename to docs/sphinx/_toc.yml.in
index c31e0f26..5d0853d1 100644
--- a/docs/.sphinx/_toc.yml.in
+++ b/docs/sphinx/_toc.yml.in
@@ -3,4 +3,7 @@
 root: index
 subtrees:
   - entries:
-    - file: .doxygen/docBin/html/index
+    - file: doxygen/html/index
+  - caption: About
+    entries:
+    - file: license
diff --git a/docs/sphinx/requirements.in b/docs/sphinx/requirements.in
new file mode 100644
index 00000000..4767b4f8
--- /dev/null
+++ b/docs/sphinx/requirements.in
@@ -0,0 +1 @@
+rocm-docs-core[api_reference]==0.31.0
diff --git a/docs/.sphinx/requirements.txt b/docs/sphinx/requirements.txt
similarity index 97%
rename from docs/.sphinx/requirements.txt
rename to docs/sphinx/requirements.txt
index 2a9654bd..1e35e71f 100644
--- a/docs/.sphinx/requirements.txt
+++ b/docs/sphinx/requirements.txt
@@ -47,7 +47,7 @@ fastjsonschema==2.16.3
     # via rocm-docs-core
 gitdb==4.0.10
     # via gitpython
-gitpython==3.1.37
+gitpython==3.1.41
     # via rocm-docs-core
 idna==3.4
     # via requests
@@ -57,7 +57,7 @@ importlib-metadata==6.8.0
     # via sphinx
 importlib-resources==6.1.0
     # via rocm-docs-core
-jinja2==3.1.2
+jinja2==3.1.3
     # via
     #   myst-parser
     #   sphinx
@@ -118,7 +118,7 @@ requests==2.31.0
     # via
     #   pygithub
     #   sphinx
-rocm-docs-core[api-reference]==0.27.0
+rocm-docs-core[api-reference]==0.31.0
     # via
     #   -r requirements.in
     #   rocm-docs-core
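
Review note: the docs/conf.py hunk in this patch derives the documentation version from the `set(VERSION_STRING ...)` line in CMakeLists.txt. The snippet below is a minimal sketch for exercising that regular expression in isolation, not part of the patch itself. The helper name `extract_version` and the inline `SAMPLE_CMAKE` text are hypothetical and exist only for illustration; the pattern and the error message are copied from the hunk.

```python
import re

# Pattern copied from the docs/conf.py hunk in this patch.
VERSION_PATTERN = r'set\(VERSION_STRING\s+\"?([0-9.]+)[^0-9.]+'


def extract_version(cmake_text: str) -> str:
    """Return the version captured from a set(VERSION_STRING ...) line.

    Hypothetical helper: conf.py performs the same search inline on the
    contents of ../CMakeLists.txt instead of calling a function.
    """
    match = re.search(VERSION_PATTERN, cmake_text)
    if not match:
        raise ValueError("VERSION not found!")
    return match[1]


# Hypothetical stand-in for the relevant part of CMakeLists.txt.
SAMPLE_CMAKE = (
    "# Setup VERSION\n"
    'set(VERSION_STRING "3.1.0")\n'
    "rocm_setup_version(VERSION ${VERSION_STRING})\n"
)

version_number = extract_version(SAMPLE_CMAKE)
left_nav_title = f"hipCUB {version_number} Documentation"
print(left_nav_title)
```

Because the pattern ends in `[^0-9.]+`, at least one character that is neither a digit nor a dot has to follow the version, which the closing quote or parenthesis provides in a normal CMake `set()` call.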