CK GEMM Backend #1480

Draft
wants to merge 58 commits into base: rocm6.3_internal_testing

Commits on Jun 17, 2024

  1. [SOW MS3] Centos stream9 PyTorch image support (ROCm#1090)

    * changes to build Centos stream 9 images
    
    * Added scripts for centos and centos stream images
    
    * Added an extra line
    
    * Add ninja installation
    
    * Optimized code
    
    * Fixes
    
    * Add comment
    
    * Optimized code
    
    * Added AMDGPU mapping for ROCm 5.2 and invalid-url for rocm_baseurl
    
    Co-authored-by: Jithun Nair <[email protected]>
    2 people authored and pruthvistony committed Jun 17, 2024
    e023400

Commits on Jun 19, 2024

  1. b8a2811
  2. Temporarily skip test_conv3d_64bit_indexing

    - Rocblas API support is requested
    - SWDEV-383635 & sub task - SWDEV-390218
    pruthvistony authored and dnikolaev-amd committed Jun 19, 2024
    59e9341
  3. Enable tensorpipe with hip_basic backend (ROCm#1135)

    * Add hip_basic tensorpipe support to PyTorch
    
    * Enabling hip_basic for Tensorpipe for pyTorch
    
    * removing upstream tensorpipe module
    
    * Adding ROCm specific tensorpipe submodule
    
    * tensorpipe submodule updated
    
    * Update the hip invalid device string
    
    * Added ignore for tensorpipe git submodule
    
    * Moved include of tensorpipe_cuda.h to hipify
    
    * Updates based on review comments
    
    * Defining the variable __HIP_PLATFORM_AMD__
    
    * Enabling the UTs
    
    Co-authored-by: Ronak Malik <[email protected]>
    2 people authored and dnikolaev-amd committed Jun 19, 2024
    2bb132f
  4. Updates to build on Jammy

    - Fortran package installation moved after gcc
    - Update libtinfo search code in cmake1
    - Install libstdc++.so
    pruthvistony authored and dnikolaev-amd committed Jun 19, 2024
    0b08278
  5. 6e7704d
  6. 108bf57
  7. 15da21a
  8. 4003496
  9. 2cfad86
  10. 3c19bf9
  11. b7e47fa
  12. Changes to support docker v23

    Reversed the condition as required
    pruthvistony authored and dnikolaev-amd committed Jun 19, 2024
    032320c
  13. [CS9] Updates to CentOS stream 9 build (ROCm#1326)

    - Add missing common_utils.sh
    - Update the install vision part
    - Move to amdgpu rhel 9.3 builds
    - Update to pick python from conda path
    - Add a missing package
    - Add ROCM_PATH and magma
    - Updated repo radeon path
    pruthvistony authored and dnikolaev-amd committed Jun 19, 2024
    50d56db
  14. Update to hipify mapping

    pruthvistony authored and dnikolaev-amd committed Jun 19, 2024
    17ba54f
  15. Correcting usage of USE_ROCM

    pruthvistony authored and dnikolaev-amd committed Jun 19, 2024
    e00045a
  16. Enable gesvda for ROCM >= 6.1 (ROCm#1339)

    This also fixes a problem in gesvd driver when UV is not needed.
    xinyazhang authored and dnikolaev-amd committed Jun 19, 2024
    7f3172f
  17. Increase lifespan of test-times files

    - build_environment is hard-coded to the upstream value from when
      the branch was created, since the dev/QA ENV build_environment
      value can vary
    pruthvistony authored and dnikolaev-amd committed Jun 19, 2024
    a2d6ace

Commits on Jun 20, 2024

  1. Fixes CI build script (ROCm#1350)

    * Fix the parsing of /etc/os-release
    
    The old code parses OS_DISTRO as 'PRETTY_Ubuntu' on Ubuntu and thus
    never links to libtinfo correctly.
    
    * Configurable CMAKE_PREFIX_PATH in CI script.
    xinyazhang authored and dnikolaev-amd committed Jun 20, 2024
    00307cc
  2. [NO CP] Temporary dumping of test exec log to stderr

    - This is done as per QA request, needs to be reverted and
      not required to be cherry-picked into later releases.
    pruthvistony authored and dnikolaev-amd committed Jun 20, 2024
    3120778
  3. 9726c26
  4. Converted NAVI check as a function (ROCm#1364)

    * Moved NAVI check to the test file
    
    * Revised NAVI check as a function
    BLOrange-AMD authored and dnikolaev-amd committed Jun 20, 2024
    91125f1
  5. 623579f
  6. Remove ROCmloops specific test

    pruthvistony authored and dnikolaev-amd committed Jun 20, 2024
    b39d5fa
  7. 6d3494e
  8. f02e87f
  9. Skip test_mm_triton_kernel_benchmark (ROCm#1376)

    * Running triton kernel on ROCM only has one GB/s metric reported
    
    * Update test_kernel_benchmark.py
    pragupta authored and dnikolaev-amd committed Jun 20, 2024
    c1f1f51
  10. Implementation of PyTorch ut parsing script - QA helper function (ROCm#1386)
    
    * Initial implementation of PyTorch ut parsing script
    
    * Extracted path variables
    
    * Use nested dict to save results
    
    * Fixes typo
    
    * Cleanup
    
    * Fixes several issues
    
    * Minor name change
    
    * Update run_pytorch_unit_tests.py
    
    * Added file banners
    
    * Supported running from API
    
    * Added more help info
    
    * Consistent naming
    
    * Format help text
    
    ---------
    
    Co-authored-by: Jithun Nair <[email protected]>
    Co-authored-by: Jithun Nair <[email protected]>
    3 people authored and dnikolaev-amd committed Jun 20, 2024
    3720952
  11. 6f65d22
  12. PR ROCm#1255 to rocm6.2 release

    ramcherukuri authored and dnikolaev-amd committed Jun 20, 2024
    98df198
  13. [ROCm] skip warp update to 64 for gfx10 and gfx11 (ROCm#1417)

    * Warp update to 64 for NAVI3x is skipped
    
    * adding warp_size to device properties
    
    * adding warp_size to device properties
    ramcherukuri authored and dnikolaev-amd committed Jun 20, 2024
    f18c060
  14. cb0e9ad
  15. b4abc4b
  16. [release/2.1] Skip certificate check for CentOS7 since certificate expired (ROCm#1399)
    
    * Skip certificate check only for CentOS7 since certificate expired
    
    * Naming
    jithunnair-amd authored and dnikolaev-amd committed Jun 20, 2024
    c716c2e
  17. 09b800a
  18. a0872c0
  19. Change Torch extra install requirement

    - PYTORCH_EXTRA_INSTALL_REQUIREMENTS is set in builder repo
    - Remove the PYTORCH_EXTRA_INSTALL_REQUIREMENTS step from this file
    pruthvistony authored and dnikolaev-amd committed Jun 20, 2024
    700ee13
  20. Remove the installation of rocm-llvm-dev package

    - Causing regression - SWDEV-463083
    pruthvistony authored and dnikolaev-amd committed Jun 20, 2024
    8f95824
  21. Fix SWDEV-459623 (ROCm#1428)

    * Fix SWDEV-459623. The Rank of logsumexp Tensor must be 3.
    
    This tensor was considered for internal use only but apparently exposed to UTs.
    
    * Fix for mGPU.
    
    The stream should be selected after picking the current device according
    to input tensor.
    xinyazhang authored and dnikolaev-amd committed Jun 20, 2024
    5f9b3f4
  22. Enable fp8 inductor unit tests (ROCm#1421)

    * Add formal FP8 check in common_cuda.py
    
    * Enable inductor/test_valid_cast
    
    * Support for test_eager_fallback
    
    * allow fnuz types on amax test
    
    * Finalize passing tests vs failing
    
    * Fix fnuz constants in _to_fp8_saturated
    alugorey authored and dnikolaev-amd committed Jun 20, 2024
    90df487
  23. Enable NHWC batchnorm for miopen (ROCm#1400)

    * Enable batchnorm NHWC for MIOpen
    
    * cleanup
    
    * test to compare NHWC MIOpen batchnorm with CPU
    
    * fix 'use_miopen' condition for nhwc miopen
    
    * fix includes
    
    * use native nhwc batchnorm to verify miopen
    
    * remove extra spaces
    
    * remove empty lines
    
    * set PYTORCH_MIOPEN_SUGGEST_NHWC=1 for all test_nn.py test
    dnikolaev-amd committed Jun 20, 2024
    4380b15
  24. a390471
  25. 6be1d5d
  26. 31b3681
  27. cefda3a
  28. 8068d3d
  29. Print consolidated log file for pytorch unit test automation scripts (ROCm#1433)
    
    * Print consolidated log file for pytorch uts
    
    * Update run_entire_tests subprocess call as well
    
    * lint
    
    * Add ERROR string
    jithunnair-amd authored and dnikolaev-amd committed Jun 20, 2024
    5187ca9
  30. [ROCm] Intra-node all reduce initial implementation (ROCm#1435)

    * Initial commit to port intra_node_comm to ROCm
    
    (cherry picked from commit 48d1c33)
    
    * gpt-fast running now with intra-node comm
    
    (cherry picked from commit 618c54e)
    
    ---------
    
    Co-authored-by: Prachi Gupta <[email protected]>
    2 people authored and dnikolaev-amd committed Jun 20, 2024
    0c2f97c
  31. 012c13b
  32. rocm6.3 related_commits

    dnikolaev-amd committed Jun 20, 2024
    6e45ab1
  33. caching test_times

    dnikolaev-amd committed Jun 20, 2024
    3aa060d
  34. Sync updates from hipify_torch. (ROCm#1168)

    Co-authored-by: Jithun Nair <[email protected]>
    2 people authored and dnikolaev-amd committed Jun 20, 2024
    0c5d257
  35. ecf4e8d

Commits on Jun 26, 2024

  1. Merge pull request ROCm#1436 from ROCm/IFU_CP_06172024

    IFU for rocm6.3_internal_testing
    pruthvistony authored Jun 26, 2024
    8f19207

Commits on Jun 27, 2024

  1. 5de711c

Commits on Jul 8, 2024

  1. 4459b67

Commits on Jul 10, 2024

  1. dd43b9b

Commits on Jul 18, 2024

  1. CK GEMM Backend

    jeffdaily authored and alugorey committed Jul 18, 2024
    1b6b84e