-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[jvm-packages][breaking]Rework xgboost4j jvm packages for spark #10617
base: master
Are you sure you want to change the base?
Conversation
hey @trivialfis @hcho3 @dotbg, please help review this PR, thx very much. |
Would you please split this pull requests into smaller chunks so that it is easier to review? |
It's hard to separate it, since it's completely jvm rewrite. The whole code architecture has been re-orged. and totally different implementation. |
- Remove xgboost4j-gpu and move its functionality to xgboost4j-spark-gpu - Remove any soft links in xgboost4j-spark-gpu - Abstract an XGBoost Estimator which handles the common functionality for all xgboost estimators. - Support XGBoostRanker - Remove any unnecessary ETL in XGBoost - Fix the missing value usage - Support uber jar for xgboost4j-spark - Support uber jar for xgboost4j-spark-gpu - Rework GPU plugin - More sannity test to ensure the training/transform results are same between CPU and GPU.
I won't review the code, since I don't think I am qualified to review such a large change in the Java/Scala code base. The best I can do is to review the changes in the CI/CD pipelines. |
Thx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a few comments, but I am rather far from done yet
@@ -133,7 +133,7 @@ | |||
<property name="braceAdjustment" value="0"/> | |||
<property name="caseIndent" value="2"/> | |||
<property name="throwsIndent" value="4"/> | |||
<property name="lineWrappingIndentation" value="4"/> | |||
<property name="lineWrappingIndentation" value="2"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this change in another PR
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding> | ||
<maven.compiler.source>1.8</maven.compiler.source> | ||
<maven.compiler.target>1.8</maven.compiler.target> | ||
<flink.version>1.19.0</flink.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<flink.version>1.19.0</flink.version> | |
<flink.version>1.19.1</flink.version> |
<junit.version>4.13.2</junit.version> | ||
<spark.version>3.5.1</spark.version> | ||
<spark.version.gpu>3.5.1</spark.version.gpu> | ||
<fasterxml.jackson.version>2.15.2</fasterxml.jackson.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<fasterxml.jackson.version>2.15.2</fasterxml.jackson.version> | |
<fasterxml.jackson.version>2.17.2</fasterxml.jackson.version> |
<maven.wagon.http.retryHandler.count>5</maven.wagon.http.retryHandler.count> | ||
<log.capi.invocation>OFF</log.capi.invocation> | ||
<use.cuda>OFF</use.cuda> | ||
<spark.rapids.version>24.04.1</spark.rapids.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<spark.rapids.version>24.04.1</spark.rapids.version> | |
<spark.rapids.version>24.06.0</spark.rapids.version> |
<use.cuda>OFF</use.cuda> | ||
<spark.rapids.version>24.04.1</spark.rapids.version> | ||
<cudf.classifier>cuda12</cudf.classifier> | ||
<scalatest.version>3.2.18</scalatest.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<scalatest.version>3.2.18</scalatest.version> | |
<scalatest.version>3.2.19</scalatest.version> |
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-jar-plugin</artifactId> | ||
<version>3.4.1</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<version>3.4.1</version> | |
<version>3.4.2</version> |
@@ -82,19 +82,27 @@ This file is divided into 3 sections: | |||
</check> | |||
|
|||
<check level="error" class="org.scalastyle.scalariform.ClassNamesChecker" enabled="true"> | |||
<parameters><parameter name="regex"><![CDATA[[A-Z][A-Za-z]*]]></parameter></parameters> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's have these changes in another PR.
"device" -> device | ||
) | ||
) | ||
).setNumRound(10).setNumWorkers(numWorkers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will it still work if we specify the map values instead?
HI @dotbg, Thx for review, some other reviewers complained this PR is quite huge PR, So I will break it down to smaller PRs. Will let you know. Thx again. |
About this PR
Features and Fixes
Introduce an abstract XGBoost Estimator:
Update to the latest XGBoost parameters
Implement support for XGBoostRanker:
Address the missing value handling:
Consolidate xgboost4j-gpu and xgboost4j-spark-gpu:
Remove any ETL operations in XGBoost
Create uber jars for xgboost4j-spark and xgboost4j-spark-gpu:
Rework the GPU plugin:
Expand sanity tests for CPU and GPU consistency:
Breaking changes
Remove some unused parameters like Rabit related parameters and train_test_ratio. etc.
Separate XGBoost-spark parameter from the whole XGBoost parameters, and make the constructor of XGBoost estimator only accept the XGBoost parameters instead of parameters for xgboost4j-spark like,
or
XGBoostRegressor doesn't support ranking problem anymore, the functionality of ranking has been moved to the new estimator XGBoostRanker. And remove the ETL (like grouping) for ranking problem.
Removed the xgboost4j-gpu module and move the functionality into xgboost4j-spark-gpu
Test this PR