Describe the bug
I am not sure whether this is the case for all examples, but the mortgage ETL + XGBoost example has some non-trivial discrepancies between its Python script and its notebooks. For example, the Python script uses UDFs:
https://github.com/NVIDIA/spark-rapids-examples/blob/main/examples/XGBoost-Examples/mortgage/python/com/nvidia/spark/examples/mortgage/etl.py#L22-L23
while the notebook(s) implement the same steps using Spark SQL directly:
https://github.com/NVIDIA/spark-rapids-examples/blob/main/examples/XGBoost-Examples/mortgage/notebooks/python/MortgageETL.ipynb?short_path=2af22cf#L454-L478
There are some other differences as well. It looks like the scripts may be lagging behind the notebooks.
Steps/Code to reproduce bug
N/A
Expected behavior
The notebook and Python script versions should ideally be aligned (or the reason they differ should at least be documented).
Environment details (please complete the following information)
N/A
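For illustration only, here is a rough sketch of how this kind of drift between a script and a notebook could be surfaced. It is not part of the repository: the helper names (`notebook_code_lines`, `normalized`, `drift`) are hypothetical, it assumes the notebook follows the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry a `source` field), and it only compares stripped, non-comment lines, so it flags textual differences rather than proving the logic differs.

```python
import difflib
import json
from typing import List


def notebook_code_lines(notebook_json: str) -> List[str]:
    """Collect the source lines of all code cells in an .ipynb JSON string."""
    nb = json.loads(notebook_json)
    lines: List[str] = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", [])
        # "source" may be stored as one string or as a list of strings
        text = source if isinstance(source, str) else "".join(source)
        lines.extend(text.splitlines())
    return lines


def normalized(lines: List[str]) -> List[str]:
    """Strip whitespace and drop blank lines and full-line comments."""
    stripped = (line.strip() for line in lines)
    return [s for s in stripped if s and not s.startswith("#")]


def drift(script_text: str, notebook_json: str) -> List[str]:
    """List lines that appear on only one side (script vs. notebook)."""
    a = normalized(script_text.splitlines())
    b = normalized(notebook_code_lines(notebook_json))
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    report: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        report.extend(f"script only: {line}" for line in a[i1:i2])
        report.extend(f"notebook only: {line}" for line in b[j1:j2])
    return report


if __name__ == "__main__":
    # Made-up stand-ins for a script and a notebook, not the real example files.
    demo_script = "x = 1\ny = udf_version(x)\n"
    demo_notebook = json.dumps({
        "cells": [
            {"cell_type": "markdown", "source": ["# ETL"]},
            {"cell_type": "code", "source": ["x = 1\n", "y = sql_version(x)"]},
        ]
    })
    for entry in drift(demo_script, demo_notebook):
        print(entry)
```

In practice the two inputs would be read from the script file and the `.ipynb` file; an empty result means the normalized code lines match, while any entries point at places worth reviewing by hand.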
@nvliyuan Do you remember who wrote these examples? I can't recall the reason for the difference, but there should be one.
Yes, different implementations of the same example should keep the same logic. I will draft a PR to fix it.
nvliyuan