Update the python notebooks for the customer churn examples #452

Merged: 1 commit into NVIDIA:branch-24.12 on Oct 25, 2024

Conversation

NvTimLiu (Collaborator)

1. Add the variables 'SPARK_MASTER_URL' and 'DATA_ROOT' to support automated testing from CI/CD jobs (see the sketch below).

2. 'output_prefix' is not referenced in the lines below it (notebook screenshot omitted).
Signed-off-by: timl <[email protected]>
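
To make item 1 concrete, here is a minimal sketch of how a notebook cell might consume these environment variables so that CI/CD jobs can inject a cluster URL and data location. The default values, app name, and the "raw" subdirectory below are illustrative assumptions, not the notebook's actual code:

```python
import os

from pyspark.sql import SparkSession

# Let CI/CD jobs override the Spark master and dataset location via the
# environment, while keeping defaults for interactive/local runs.
# NOTE: the default values and the "raw" subdirectory are hypothetical.
SPARK_MASTER_URL = os.environ.get("SPARK_MASTER_URL", "local[*]")
DATA_ROOT = os.environ.get("DATA_ROOT", "/data/customer_churn")

spark = (
    SparkSession.builder
    .master(SPARK_MASTER_URL)
    .appName("customer-churn-example")
    .getOrCreate()
)

# Derive input paths from DATA_ROOT instead of hard-coding them.
raw_path = os.path.join(DATA_ROOT, "raw")
```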
@NvTimLiu added the bug (Something isn't working) label on Oct 24, 2024
@NvTimLiu self-assigned this on Oct 24, 2024
@NvTimLiu (Collaborator, Author)

Build PASS (build status screenshot omitted).

@YanxuanLiu (Collaborator)

Does it still require the same version for the Spark cluster and PySpark?

@NvTimLiu (Collaborator, Author) commented on Oct 24, 2024

> Does it still require the same version for the Spark cluster and PySpark?

Yes, we need to keep the PySpark and Spark binary versions the same.

Actually, we only need to point PYTHONPATH at the current Spark binary (export PYTHONPATH=$SPARK_HOME/python).
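
As an aside, the version constraint being discussed could be checked explicitly at the top of a notebook; a small, hypothetical guard (not part of this PR) might look like this:

```python
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

driver_version = pyspark.__version__  # version of the pyspark package on the driver
cluster_version = spark.version       # Spark version reported by the cluster

if driver_version != cluster_version:
    raise RuntimeError(
        f"pyspark {driver_version} != Spark cluster {cluster_version}; "
        "install a matching pyspark, or set PYTHONPATH=$SPARK_HOME/python "
        "to use the PySpark bundled with the Spark binary."
    )
```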

@YanxuanLiu (Collaborator)

> Does it still require the same version for the Spark cluster and PySpark?
>
> Yes, we need to keep the PySpark and Spark binary versions consistent.
>
> Actually, we only need to point PYTHONPATH at the current Spark binary (export PYTHONPATH=$SPARK_HOME/python).

Can we avoid the "Spark session not found" error by only setting PYTHONPATH and ignoring the installed pyspark version?

@NvTimLiu (Collaborator, Author)

> > Does it still require the same version for the Spark cluster and PySpark?
> >
> > Yes, we need to keep the PySpark and Spark binary versions consistent. Actually, we only need to point PYTHONPATH at the current Spark binary (export PYTHONPATH=$SPARK_HOME/python).
>
> Can we avoid the "Spark session not found" error by only setting PYTHONPATH and ignoring the installed pyspark version?

Yes, right.
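
For reference, the "PYTHONPATH only" approach agreed on above can also be done from inside Python. This is a sketch assuming a standard Spark binary layout (the py4j zip location is an assumption about the distribution, and this code is not part of the PR):

```python
import glob
import os
import sys

spark_home = os.environ["SPARK_HOME"]

# Equivalent to: export PYTHONPATH=$SPARK_HOME/python
sys.path.insert(0, os.path.join(spark_home, "python"))

# Spark binary distributions typically bundle py4j under python/lib; add it too.
for py4j_zip in glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip")):
    sys.path.insert(0, py4j_zip)

from pyspark.sql import SparkSession  # resolved from $SPARK_HOME/python, no pip-installed pyspark needed

spark = (
    SparkSession.builder
    .master(os.environ.get("SPARK_MASTER_URL", "local[*]"))
    .getOrCreate()
)
```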

@YanxuanLiu (Collaborator) left a review comment


LGTM. We also need to update the README to describe the environment required to run the notebook.

@nvliyuan (Collaborator) left a review comment


LGTM

@NvTimLiu merged commit 119897c into NVIDIA:branch-24.12 on Oct 25, 2024
3 checks passed