Is your feature request related to a problem or challenge? Please describe what you are trying to do.
There are cases where you would like to pass additional data to a UDF that is not an expression. For example, suppose you wrote a UDF that performs some kind of lookup and has a signature like
def my_lookup(column: Expr, lookup_values: dict[str, str]) -> Expr
It would be convenient to allow passing in arbitrary data such as lookup_values.
Describe the solution you'd like
Most likely this would mean a change to PyScalarUDF::__call__ and I could see changing the function definition to something like fn __call__(&self, args: Vec<PyAny>) -> PyResult<PyExpr>. It's not immediately obvious to me how to set additional parameters, but it may mean switching from SimpleScalarUDF to a different ScalarUDFImpl to carry the additional data.
Describe alternatives you've considered
Right now you would need to use one of these approaches:
Create a class that holds these values and use its __call__ method as your UDF
Take advantage of variable scope so that the UDF can see lookup_values
Pass the variables as literals
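As a rough illustration of the first two workarounds, here is a minimal sketch in plain Python. It deliberately leaves out the datafusion wrapper, and it uses Python lists as a stand-in for the Arrow arrays a real UDF would receive. The names LookupUdf and make_lookup are hypothetical and exist only for this example.

```python
from typing import Callable, Dict, List, Optional


class LookupUdf:
    """Workaround 1: a callable class that carries lookup_values as state."""

    def __init__(self, lookup_values: Dict[str, str]) -> None:
        self.lookup_values = lookup_values

    def __call__(self, column: List[str]) -> List[Optional[str]]:
        # A plain list stands in for the Arrow array (assumption for this sketch).
        return [self.lookup_values.get(value) for value in column]


def make_lookup(
    lookup_values: Dict[str, str],
) -> Callable[[List[str]], List[Optional[str]]]:
    """Workaround 2: capture lookup_values in the enclosing scope (a closure)."""

    def my_lookup(column: List[str]) -> List[Optional[str]]:
        return [lookup_values.get(value) for value in column]

    return my_lookup


codes = {"a": "alpha", "b": "beta"}
by_class = LookupUdf(codes)
by_closure = make_lookup(codes)
```

In both cases the extra data travels with the callable rather than through the call arguments, which is why neither needs a change to how the UDF is invoked.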
Additional context
After some initial investigation, the way the window functions use an instance has a problem: when the instance is reused, it carries over its past state. I bet that's why the aggregate functions were set up to take a class type and then instantiate it in to_rust_accumulator. I'm still trying to understand why this works, because it seems like the only place we instantiate is in create_udaf, yet we clearly get two calls to __init__ in the unit test when I add a second aggregation to the same test using the same UDF. This requires further investigation.
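The state carry-over described above can be reproduced with a small plain-Python sketch. SumAccumulator and aggregate_with are hypothetical names for this illustration, not the datafusion accumulator interface. The point is only the difference between reusing one instance and passing a class that is instantiated once per aggregation.

```python
from typing import Callable, List


class SumAccumulator:
    """Toy stateful accumulator (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.total = 0

    def update(self, values: List[int]) -> None:
        self.total += sum(values)

    def evaluate(self) -> int:
        return self.total


def aggregate_with(factory: Callable[[], SumAccumulator], values: List[int]) -> int:
    """Run one aggregation using whatever accumulator the factory returns."""
    acc = factory()
    acc.update(values)
    return acc.evaluate()


# Reusing a single instance: the second aggregation sees leftover state.
shared = SumAccumulator()
reused_first = aggregate_with(lambda: shared, [1, 2, 3])
reused_second = aggregate_with(lambda: shared, [10])

# Passing the class itself: each aggregation gets a fresh instance.
fresh_first = aggregate_with(SumAccumulator, [1, 2, 3])
fresh_second = aggregate_with(SumAccumulator, [10])
```

Here reused_second includes the total from the first run, while fresh_second does not, which matches the motivation for passing a class type rather than an instance.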