[SPARK-49695][SQL] Postgres fix xor push-down #48144
base: master
Conversation
Can we add some tests?
It would be nice if you added an integration test, for instance in *PostgresExpressionPushdownSuite*.
I wanted to add tests, but didn't know where to put them. I don't know how OSS tests Postgres.
Thank you, @andrej-db. +1 for adding a test case, definitely.
cc @huaxingao, too.
Added the test, let me know if this is in order.
@@ -986,4 +986,13 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
  test("scan with filter push-down with date time functions") {
    testDatetime(s"$catalogAndNamespace.${caseConvert("datetime")}")
  }

  test("xor operator push-down") {
Maybe you can run explain formatted and check whether the string contains `"id" # 3`; that will add a little bit of robustness.
Another thing we can do is write a unit test that just invokes compilation of the XOR expression and checks whether `col # constant` is the result of compilation.
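The string check suggested here could look roughly like the sketch below. `PushdownCheck` and `showsPostgresXor` are hypothetical names, and the sample strings are made-up stand-ins for explain output, not real Spark output.

```scala
object PushdownCheck {
  /** True when the text contains `"column" # constant` and no `"column" ^ constant`. */
  def showsPostgresXor(explainText: String, column: String, constant: Int): Boolean = {
    val quoted = "\"" + column + "\""
    explainText.contains(s"$quoted # $constant") &&
      !explainText.contains(s"$quoted ^ $constant")
  }

  def main(args: Array[String]): Unit = {
    // Made-up strings standing in for explain output; the real format may differ.
    val fixed = "pushed filter: (\"id\" # 3) = 0"
    val broken = "pushed filter: (\"id\" ^ 3) = 0"
    println(showsPostgresXor(fixed, "id", 3))  // true
    println(showsPostgresXor(broken, "id", 3)) // false
  }
}
```

Checking for the absence of `^` as well as the presence of `#` guards against a plan that happens to contain both forms.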
neat, I like it
Approving latest iteration (with tests)
PostgresIntegrationSuite: add test
Please, fix your test:
[info] - SPARK-49695: Postgres fix xor push-down *** FAILED *** (10 milliseconds)
[info] org.apache.spark.sql.catalyst.ExtendedAnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `bar` cannot be found. Verify the spelling and correctness of the schema and catalog.
override def compileExpression(expr: Expression): Option[String] = {
  val builder = new PostgresSQLBuilder()
  try {
Perhaps it's better to only override visitBinaryArithmetics?
We have a similar problem with some of the functions, and we use dialectFunctionName to translate from Spark to the local dialect.
+1 here, let's override the last possible method in the chain of execution.
done
Consider refactoring this
...integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala
LGTM, but please make the test more assertive.
@andrej-db Could you fix the build and retrigger the integration tests:
[error] /home/runner/work/spark/spark/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala:237:25: not found: type Filter
[error] plan.isInstanceOf[Filter]
[error] ^
[error] one error found
What changes were proposed in this pull request?
This PR fixes the pushdown of the ^ operator (XOR operator) for Postgres. Postgres uses ^ as the exponentiation operator rather than as bitwise XOR.
The fix consists of overriding the SQLExpressionBuilder to replace the '^' character with '#'.
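A minimal sketch of this kind of override, assuming a builder that renders binary arithmetic through a single method. `SketchSQLBuilder` and `PostgresSketchSQLBuilder` are hypothetical stand-ins written for illustration, not Spark's actual classes; the method name is modeled loosely on the one mentioned in review.

```scala
// Stand-in for a generic SQL builder; NOT Spark's real class.
class SketchSQLBuilder {
  // Renders a binary arithmetic expression as "<left> <op> <right>".
  def visitBinaryArithmetic(name: String, l: String, r: String): String =
    s"$l $name $r"
}

// Postgres variant: '^' means exponentiation there, so emit '#' (bitwise XOR) instead.
class PostgresSketchSQLBuilder extends SketchSQLBuilder {
  override def visitBinaryArithmetic(name: String, l: String, r: String): String =
    super.visitBinaryArithmetic(if (name == "^") "#" else name, l, r)
}

object BuilderDemo {
  def main(args: Array[String]): Unit = {
    val pg = new PostgresSketchSQLBuilder
    println(pg.visitBinaryArithmetic("^", "\"id\"", "3")) // "id" # 3
    println(pg.visitBinaryArithmetic("+", "\"id\"", "3")) // "id" + 3
  }
}
```

Overriding only the operator-rendering step leaves every other part of expression compilation untouched, which matches the reviewers' preference for overriding the last method in the chain.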
Why are the changes needed?
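Without the fix, a pushed-down ^ is evaluated by Postgres as exponentiation instead of bitwise XOR, so the query result is incorrect.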
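As a small worked example of how far the two interpretations diverge, the sketch below computes both for the operands 5 and 3 in plain Scala. Here `^` is Scala's bitwise XOR (the meaning Spark intends), and `math.pow` stands in for what Postgres computes for `5 ^ 3`; the operand values are arbitrary.

```scala
object XorVsPower {
  def main(args: Array[String]): Unit = {
    val xor = 5 ^ 3            // bitwise XOR: 101 xor 011 = 110
    val power = math.pow(5, 3) // exponentiation: 5 * 5 * 5
    println(xor)   // 6
    println(power) // 125.0
  }
}
```

In Postgres, the XOR result corresponds to `5 # 3`, which is why the pushed-down operator has to be rewritten.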
Does this PR introduce any user-facing change?
Yes. The user will now have a proper translation of the ^ operator.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.