This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Spline producer url unreachable from spark #1354

Closed
learnerr101 opened this issue Jul 23, 2024 · 8 comments

Comments

@learnerr101

I used the TL;DR configuration and packages given at https://absaoss.github.io/spline/: the package za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.2, with 'spark.spline.producer.url' set to 'http://localhost:9090/producer'.

But I get this error in my Spark job:

Screenshot 2024-07-24 003938

Configurations:
Spark -> 2.4.2
Scala -> 2.12
Commands used:
spark-submit --packages za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.2 --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" --conf "spark.spline.producer.url=http://localhost:9090/producer" AB_2.py

I have also tried Spark 3.0.0 with compatible artifacts and Java version, but I get the same error.

@cerveada cerveada added this to Spline Jul 23, 2024
@github-project-automation github-project-automation bot moved this to New in Spline Jul 23, 2024
@cerveada
Contributor

Try following the steps in this guide: #1225

@learnerr101
Author

So these were the steps I performed:

Troubleshooting Spline Agent:
I was able to see the logs pertaining to the lineage.
Attaching the log file for your reference
Log.txt
Troubleshooting Arango DB:
image

Troubleshooting Spline Server:
image

Troubleshooting Spline UI:
image

Still I'm getting the same error for some reason.

@cerveada
Contributor

In your config you use port 9090, but in the screenshot of the server I see port 8080.

@learnerr101
Author

Yes, I ran the command with both ports.

With 9090:
spark-submit --packages za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.2 --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" --conf "spark.spline.producer.url=http://localhost:9090/producer" AB_2.py

With 8080:
spark-submit --packages za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.2 --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" --conf "spark.spline.producer.url=http://localhost:8080/producer" AB_2.py

In both scenarios I get the same error.

@cerveada
Contributor

In the log I see output from LoggingLineageDispatcher, so the agent itself is working. Now switch to the HTTP dispatcher.

  • Use the URL shown under "Producer API - Base URL" on the server page.
  • I would use the latest release version of the agent: 2.1.0.
  • Make sure the agent can actually connect to the server and that nothing is blocking the connection.
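A quick way to test the last point is to open a plain TCP connection to the host and port of the producer URL from the machine where Spark runs. The following is only a minimal sketch: the helper names `producer_endpoint` and `can_connect` are made up for illustration, and a successful connection only shows that something is listening on that port, not that it is the Spline Producer API.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def producer_endpoint(url: str) -> Tuple[str, int]:
    """Return (host, port) for a URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"no host in URL: {url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = producer_endpoint(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # The two URLs tried earlier in this thread.
    for candidate in (
        "http://localhost:8080/producer",
        "http://localhost:9090/producer",
    ):
        state = "reachable" if can_connect(candidate) else "NOT reachable"
        print(candidate, state)
```

Run it on the same host (or inside the same container) as the Spark driver, since that is where the agent opens its connection from.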

@learnerr101
Author

learnerr101 commented Jul 25, 2024

Yes, I tried the following command:

spark-submit \
  --packages za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:2.1.0 \
  --conf spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener \
  --conf spark.spline.lineageDispatcher=http \
  --conf spark.spline.lineageDispatcher.http.producer.url=http://localhost:8080/producer \
  AB_2.py

So the config as of now is:
Spark: 3.0.0
Scala: 2.12
Spline: 2.1.0

But the error persists!
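For trying several producer URLs without retyping the whole command, the same settings can be assembled programmatically. This is a rough sketch, not part of Spline: `spline_http_conf` and `spark_submit_args` are hypothetical helpers, and the property names and package coordinate are copied from the command above.

```python
import shlex
from typing import Dict, List

# Listener class name as used in the commands in this thread.
LISTENER = "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener"


def spline_http_conf(producer_url: str) -> Dict[str, str]:
    """Spark properties that select the HTTP dispatcher for a given producer URL."""
    return {
        "spark.sql.queryExecutionListeners": LISTENER,
        "spark.spline.lineageDispatcher": "http",
        "spark.spline.lineageDispatcher.http.producer.url": producer_url,
    }


def spark_submit_args(package: str, conf: Dict[str, str], script: str) -> List[str]:
    """Build the spark-submit argument list: package, one --conf per property, script."""
    args = ["spark-submit", "--packages", package]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(script)
    return args


if __name__ == "__main__":
    package = "za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:2.1.0"
    conf = spline_http_conf("http://localhost:8080/producer")
    # Print a copy-pasteable command line instead of running it.
    print(shlex.join(spark_submit_args(package, conf, "AB_2.py")))
```

Building the list first and printing it keeps the sketch free of side effects; the list could also be passed to `subprocess.run` once the URL is confirmed.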

@cerveada
Contributor

If everything else works, it must be a networking issue. Are you sure that the server is reachable? Is Spark running locally on the same machine as the server? If Spark is running somewhere else, localhost will not work.

Also, if the server or the agent is running in a Docker container, the container may block the connection.
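To illustrate the localhost point: a loopback address always refers to the machine (or container) that opens the connection, so a producer URL on localhost only works when the server listens in that same network namespace. A small sketch that flags such URLs follows; `points_at_loopback` is a hypothetical helper, and it deliberately does not resolve DNS names.

```python
import ipaddress
from urllib.parse import urlsplit


def points_at_loopback(url: str) -> bool:
    """Return True if the URL's host is 'localhost' or a literal loopback IP."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (a DNS name); this sketch does not resolve it.
        return False


if __name__ == "__main__":
    url = "http://localhost:8080/producer"
    if points_at_loopback(url):
        print(f"{url} only works if the server runs where Spark runs")
```

If this prints a warning and Spark or the server runs in a container or on another host, the producer URL needs a host name or IP that is reachable from the Spark driver.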

@wajda
Contributor

wajda commented Jul 25, 2024

I'm converting this issue into a discussion, as the problem is clearly on the user's side.

@AbsaOSS AbsaOSS locked and limited conversation to collaborators Jul 25, 2024
@wajda wajda converted this issue into discussion #1355 Jul 25, 2024
@github-project-automation github-project-automation bot moved this from New to Done in Spline Jul 25, 2024
