[EPIC] Complete datafusion-spark Spark Compatible Functions #15914

Open
2 of 7 tasks
alamb opened this issue May 1, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@alamb
Contributor

alamb commented May 1, 2025

Is your feature request related to a problem or challenge?

Many DataFusion users are using DataFusion to execute workloads originally developed for Apache Spark. Examples include

They often do this for superior performance

  • Part of running Spark workloads is emulating Spark semantics
  • Emulating Spark semantics requires (among other things) functions compatible with Spark (which differ in semantics from the functions included in DataFusion)

Several projects are in the process of implementing Spark-compatible function libraries using DataFusion's extension APIs. However, we concluded in #5600 that we could join forces and maintain a Spark-compatible function library in the core datafusion repo. @shehabgamin has implemented the first step in #15168 🙏

Describe the solution you'd like

This ticket tracks "completing" the Spark function library started in #15168

Describe alternatives you've considered

Related Issues

Additional context

No response
