update docs for xgboost1.7.1 and add python notebooks #252

Merged
merged 15 commits into from
Dec 19, 2022

Conversation

nvliyuan
Collaborator

@nvliyuan nvliyuan commented Nov 21, 2022

Signed-off-by: liyuan [email protected]
This PR is to:

  1. update getting started docs for xgboost1.7.0
  2. add python notebooks
  3. update the out-of-date AWS doc (fixes "AWS getting started doc is out of date" #244)
  4. update xgboost1.7.0 performance (fixes "since dmlc xgboost 1.7.0 released, we need to update the codes/docs/notebooks" #241)

@nvliyuan nvliyuan marked this pull request as draft November 21, 2022 04:37
@nvliyuan nvliyuan changed the title update docs for xgboost1.7.0 and add python notebooks update docs for xgboost1.7.0 and add python notebooks[WIP] Nov 21, 2022
@nvliyuan nvliyuan changed the title update docs for xgboost1.7.0 and add python notebooks[WIP] update docs for xgboost1.7.1 and add python notebooks[WIP] Dec 7, 2022
@nvliyuan nvliyuan marked this pull request as ready for review December 9, 2022 09:28
@nvliyuan nvliyuan self-assigned this Dec 9, 2022
@nvliyuan nvliyuan changed the title update docs for xgboost1.7.1 and add python notebooks[WIP] update docs for xgboost1.7.1 and add python notebooks Dec 14, 2022
NvTimLiu
NvTimLiu previously approved these changes Dec 14, 2022
Collaborator

@NvTimLiu NvTimLiu left a comment

CI job against the notebooks got PASS, +1

3. Install the XGBoost, cudf-cu11, and numpy libraries on all nodes before running the XGBoost application.

``` bash
pip install xgboost
```
Collaborator

Do we need to install scikit-learn?

Collaborator Author

yes, add 'pip install scikit-learn '

``` bash
pip install xgboost
pip install cudf-cu11 --extra-index-url=https://pypi.ngc.nvidia.com
pip install numpy
```
Collaborator

do we still install numpy?

Collaborator Author

yes, XGBoost depends on numpy

``` bash
pip install xgboost
pip install cudf-cu11 --extra-index-url=https://pypi.ngc.nvidia.com
pip install numpy
```
Collaborator

same as the previous comment

Most data scientists spend a lot of time not only on
training models but also processing the large amounts of data needed to train these models.
As you can see below, XGBoost training on GPUs can be up to 10X and data processing using
RAPIDS Accelerator can also be accelerated with an end-to-end speed-up of 7X on GPU compared to CPU.
As you can see below, Pyspark+XGBoost training on GPUs can be up to 13X and data processing using
Collaborator

Can we also have a benchmark testing for xgboost-jvm-gpu?

Collaborator Author

no, but I think we can add it in another PR.

Collaborator

@wbo4958 wbo4958 left a comment

Overall, it LGTM.
