Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
There are some cases where you would like to pass additional data to a UDF that is not an expression. For example, suppose you wrote a UDF that did some kind of lookup and the function had a signature like:
def my_lookup(column: Expr, lookup_values: dict[str, str]) -> Expr
It would be convenient to allow for passing in arbitrary data.
Describe the solution you'd like
Most likely this would mean a change to PyScalarUDF::__call__, and I could see changing the function definition to something like fn __call__(&self, args: Vec<PyAny>) -> PyResult<PyExpr>. It's not immediately obvious to me how to set additional parameters, but it may mean switching from SimpleScalarUDF to a different ScalarUDFImpl to carry the additional data.
Describe alternatives you've considered
Right now you would need to use one of these approaches:
- Create a class that you can set these values on and then use the __call__ function as your UDF
- Take advantage of setting the variable scope for lookup_values
- Pass the variables as literals
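For reference, the first two workarounds might look roughly like the sketch below. It is plain Python only: the names LookupUDF and make_lookup are hypothetical, and plain lists stand in for the Arrow arrays a real UDF would receive. In practice the resulting callable would still need to be registered as a UDF with the appropriate input and return types.

```python
from typing import Callable, Optional


class LookupUDF:
    """Hypothetical callable class that carries the lookup table as state."""

    def __init__(self, lookup_values: dict[str, str]) -> None:
        self.lookup_values = lookup_values

    def __call__(self, values: list[Optional[str]]) -> list[Optional[str]]:
        # Map each input through the table; nulls and unknown keys become None.
        # Assumption: a real UDF would operate on Arrow arrays, not lists.
        return [
            self.lookup_values.get(v) if v is not None else None
            for v in values
        ]


def make_lookup(lookup_values: dict[str, str]) -> Callable:
    """Hypothetical closure variant: lookup_values comes from the outer scope."""

    def my_lookup(values: list[Optional[str]]) -> list[Optional[str]]:
        return [
            lookup_values.get(v) if v is not None else None
            for v in values
        ]

    return my_lookup


table = {"a": "alpha", "b": "beta"}
by_class = LookupUDF(table)      # approach 1: state lives on the instance
by_closure = make_lookup(table)  # approach 2: state captured by scope
```

Both variants keep the extra data out of the call arguments, which is why they work today, but neither lets the caller pass lookup_values at the call site as the proposed signature would.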
Additional context