[CI] Migrate to rockylinux8 / manylinux_2_28_x86_64 #10399

Merged: 18 commits, Jun 17, 2024

hcho3 (Collaborator) commented Jun 7, 2024

Closes #10357

Also:

  • Use gcc-toolset-* instead of devtoolset-*. For now, use GCC 10 to support CUDA 11.8. We will migrate to CUDA 12.x in the near future (see #10370, "Update the default CTK to 12.4.").
  • Use the latest versions of CMake and Maven.
  • Remove the unused s390x image.
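
For readers unfamiliar with the toolset switch: on RHEL 8 derivatives such as Rocky Linux 8, a gcc-toolset-N package installs under /opt/rh/gcc-toolset-N and ships an `enable` script that is sourced to put the newer compiler on PATH. Below is a minimal sketch of how a CI script might activate GCC 10. It is not taken from this PR; the function name `enable_gcc_toolset` and the overridable `TOOLSET_ROOT` variable are hypothetical.

```shell
# Hypothetical helper, not from the PR: activate gcc-toolset-N if installed.
# TOOLSET_ROOT is an assumed override; it defaults to /opt/rh, where
# RHEL 8 / Rocky Linux 8 place the toolsets.
enable_gcc_toolset() {
  local version="$1"
  local root="${TOOLSET_ROOT:-/opt/rh}"
  local script="${root}/gcc-toolset-${version}/enable"
  if [ -f "${script}" ]; then
    # Sourcing the script updates PATH and related variables for this shell.
    . "${script}"
    return 0
  fi
  echo "gcc-toolset-${version} not found under ${root}" >&2
  return 1
}
```

A build script could then call `enable_gcc_toolset 10 && gcc --version` before invoking CMake.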

@hcho3 hcho3 marked this pull request as draft June 7, 2024 03:05
Comment on lines +27 to +29
# Disable CMAKE_COMPILE_WARNING_AS_ERROR option temporarily until
# https://github.com/dmlc/xgboost/issues/10400 is fixed
cmake .. ${cmake_args} -DGOOGLE_TEST=ON -DUSE_DMLC_GTEST=ON -DCMAKE_VERBOSE_MAKEFILE=ON -DENABLE_ALL_WARNINGS=ON -DCMAKE_COMPILE_WARNING_AS_ERROR=OFF -GNinja ${cmake_prefix_flag} -DHIDE_CXX_SYMBOLS=ON -DBUILD_DEPRECATED_CLI=ON

See #10400
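
The temporary workaround in the command above could also be expressed as a toggle, so it is easy to flip back once #10400 is fixed. This is a sketch only: the function name `warning_flags` and the `WARNINGS_AS_ERRORS` environment variable are hypothetical and not part of the XGBoost build scripts, while `ENABLE_ALL_WARNINGS` and `CMAKE_COMPILE_WARNING_AS_ERROR` are the options from the command shown.

```shell
# Hypothetical toggle (not in the PR): emit the warning-related CMake flags.
# WARNINGS_AS_ERRORS defaults to OFF, matching the temporary workaround.
warning_flags() {
  local mode="${WARNINGS_AS_ERRORS:-OFF}"
  case "${mode}" in
    ON|OFF) ;;
    *)
      echo "WARNINGS_AS_ERRORS must be ON or OFF, got '${mode}'" >&2
      return 1
      ;;
  esac
  echo "-DENABLE_ALL_WARNINGS=ON -DCMAKE_COMPILE_WARNING_AS_ERROR=${mode}"
}
```

It would be used unquoted so that the two flags split into separate words, for example `cmake .. $(warning_flags) -GNinja`.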

hcho3 (Collaborator, Author) commented Jun 7, 2024

It looks like we have to upgrade to CUDA 12.x to build the latest gRPC: tensorflow/tensorflow#63356

To make this PR easy to review, I will create a separate PR to upgrade gRPC.

jameslamb (Contributor) left a comment

Left some cosmetic recommendations, but overall the structure makes sense to me!

If it helps with the decision... in LightGBM we've been using manylinux_2_28 for the Python package since November 2022 (microsoft/LightGBM#5580) and I don't recall getting any reports about it being a problem.

Review threads (outdated, resolved):
tests/buildkite/build-cuda-with-rmm.sh
tests/buildkite/build-cuda.sh
tests/ci_build/Dockerfile.jvm_cross

hcho3 (Collaborator, Author) commented Jun 7, 2024

Getting a weird link error:

/workspace/xgboost: hidden symbol `_ZNSt10filesystem9_Dir_base7advanceEbRSt10error_code' isn't defined

https://buildkite.com/xgboost/xgboost-ci/builds/5501#018ff4b1-a587-4032-bb80-8b7fadcafa79/132-657

According to https://bugzilla.redhat.com/show_bug.cgi?id=1929043, upgrading to gcc-toolset-10 should fix the issue.
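
Since the suggested fix is a newer compiler toolset, a build script could fail early when the active GCC is older than 10. The sketch below works under that assumption and is not part of this PR; `gcc_major_at_least` is a hypothetical helper, and the version string would typically come from `gcc -dumpversion`.

```shell
# Hypothetical guard (not in the PR): succeed if a GCC version string such as
# "10" or "10.2.1" has a major version of at least the required one.
gcc_major_at_least() {
  local version="$1"
  local required="$2"
  local major="${version%%.*}"   # keep only the part before the first dot
  case "${major}" in
    ''|*[!0-9]*)
      echo "unrecognized GCC version: '${version}'" >&2
      return 2
      ;;
  esac
  [ "${major}" -ge "${required}" ]
}
```

For example, `gcc_major_at_least "$(gcc -dumpversion)" 10 || exit 1` would stop the build before the link step if an older compiler were picked up.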

@hcho3 hcho3 marked this pull request as ready for review June 8, 2024 05:01
trivialfis commented (this comment was marked as outdated)

hcho3 (Collaborator, Author) commented Jun 17, 2024

@trivialfis What do you think? Should we consider releasing 2.1.0 with the manylinux_2_28 standard?
Doing so would ensure that XGBoost doesn't depend on core libraries that will soon be outdated. LightGBM has been using manylinux_2_28 successfully for a while now, so the risk of breaking things is likely low.
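
For context, a manylinux_2_28 wheel targets systems with glibc 2.28 or newer (PEP 600). Below is a sketch of how a user might check whether a host meets that floor. `glibc_at_least` is a hypothetical helper that expects "major.minor" strings; on glibc systems the version can be read with `getconf GNU_LIBC_VERSION`, which prints something like `glibc 2.28`.

```shell
# Hypothetical check (not in the PR): compare a "major.minor" glibc version
# against a required floor, e.g. 2.28 for manylinux_2_28.
glibc_at_least() {
  local have="$1"
  local need="$2"
  local have_major="${have%%.*}"
  local need_major="${need%%.*}"
  local have_minor="${have#*.}"
  local need_minor="${need#*.}"
  have_minor="${have_minor%%.*}"   # drop any patch component
  need_minor="${need_minor%%.*}"
  if [ "${have_major}" -ne "${need_major}" ]; then
    # Majors differ, so the major version alone decides.
    [ "${have_major}" -gt "${need_major}" ]
    return
  fi
  [ "${have_minor}" -ge "${need_minor}" ]
}
```

A possible use is `glibc_at_least "$(getconf GNU_LIBC_VERSION | cut -d' ' -f2)" 2.28`, which succeeds on hosts new enough for manylinux_2_28 wheels.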

trivialfis (Member) left a comment

Thanks for managing the binary build! Please help backport it to 2.1.

@hcho3 hcho3 merged commit bc3747b into dmlc:master Jun 17, 2024
29 checks passed
@hcho3 hcho3 deleted the update_docker branch June 17, 2024 19:07
@hcho3 hcho3 mentioned this pull request Jun 17, 2024
14 tasks
Development

Successfully merging this pull request may close these issues.

[CI] Migration to manylinux_2_28_x86_64 and rockylinux8
3 participants