Excessive number of filters overflows the stack #11102

Closed
simonvandel opened this issue Jun 24, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@simonvandel
Contributor

Describe the bug

Consider the contrived query generated with this:

print('SELECT * FROM VALUES(1) WHERE ' + ' OR '.join(['column1 = ' + str(i) for i in range(10000)]))

It looks like this:

SELECT * FROM VALUES(1) WHERE column1 = 0 OR column1 = 1 OR column1 = 2 ...

Such a query overflows the stack.
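
To make the shape of the problem concrete, here is a minimal sketch, assuming the OR chain is parsed into a left-associative binary tree (a common choice, but an assumption here). The `or_chain` and `depth` helpers are hypothetical and are not DataFusion code. Under that assumption the nesting depth grows linearly with the number of filters:

```python
def or_chain(n):
    """Build a left-deep tree ('OR', left, right) over n equality leaves."""
    tree = ("=", "column1", 0)
    for i in range(1, n):
        tree = ("OR", tree, ("=", "column1", i))
    return tree


def depth(tree):
    """Measure nesting depth with an explicit stack, so measuring cannot overflow."""
    deepest = 0
    stack = [(tree, 1)]
    while stack:
        node, level = stack.pop()
        deepest = max(deepest, level)
        if node[0] == "OR":
            stack.append((node[1], level + 1))
            stack.append((node[2], level + 1))
    return deepest


# 10000 filters give a tree that is 10000 levels deep.
print(depth(or_chain(10000)))
```

Any pass that recurses once per level of such a tree needs stack space proportional to the number of filters.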

To Reproduce

python3 -c "print('SELECT * FROM VALUES(1) WHERE ' + ' OR '.join(['column1 = ' + str(i) for i in range(10000)]))" > /tmp/query.sql
datafusion-cli -f /tmp/query.sql
# output below
DataFusion CLI v39.0.0

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)

Expected behavior

I expected the query to either run successfully or fail with an error.
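
The "error instead of a crash" outcome could look roughly like the sketch below, written in Python rather than in DataFusion's Rust. It checks nesting depth with an explicit stack and raises a normal error past a limit. `MAX_DEPTH`, `ExprTooDeep`, and `check_depth` are hypothetical names; nothing here implies DataFusion exposes such a limit.

```python
MAX_DEPTH = 1000  # hypothetical limit, chosen only for illustration


class ExprTooDeep(Exception):
    """Raised instead of letting a recursive pass overflow the stack."""


def check_depth(tree, limit=MAX_DEPTH):
    """Walk ('OR', left, right) tuples with an explicit stack; fail past `limit`."""
    stack = [(tree, 1)]
    while stack:
        node, level = stack.pop()
        if level > limit:
            raise ExprTooDeep(f"expression nested deeper than {limit} levels")
        if isinstance(node, tuple) and node[0] == "OR":
            stack.append((node[1], level + 1))
            stack.append((node[2], level + 1))


# Same left-deep shape as the generated query, with strings as leaves.
tree = "column1 = 0"
for i in range(1, 10000):
    tree = ("OR", tree, f"column1 = {i}")

try:
    check_depth(tree)
except ExprTooDeep as err:
    print("error:", err)
```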

Additional context

LLDB shows that the culprit might be that map_children of Expr is recursive.

lldb datafusion-cli
settings set target.run-args -f /tmp/query.sql
run
bt
# shows deep stack trace like this
  * frame #0: 0x0000555557159771 datafusion-cli`datafusion_expr::tree_node::_$LT$impl$u20$datafusion_common..tree_node..TreeNode$u20$for$u20$datafusion_expr..expr..Expr$GT$::map_children::hc2fc8c2c6eec78b5 + 17
    frame #1: 0x00005555572f4b46 datafusion-cli`datafusion_expr::tree_node::transform_box::h4a3722c597863a90 + 70
    frame #2: 0x0000555557159acc datafusion-cli`datafusion_expr::tree_node::_$LT$impl$u20$datafusion_common..tree_node..TreeNode$u20$for$u20$datafusion_expr..expr..Expr$GT$::map_children::hc2fc8c2c6eec78b5 + 876
    frame #3: 0x00005555572f4b46 datafusion-cli`datafusion_expr::tree_node::transform_box::h4a3722c597863a90 + 70
    frame #4: 0x0000555557159acc datafusion-cli`datafusion_expr::tree_node::_$LT$impl$u20$datafusion_common..tree_node..TreeNode$u20$for$u20$datafusion_expr..expr..Expr$GT$::map_children::hc2fc8c2c6eec78b5 + 876
    frame #5: 0x00005555572f4b46 datafusion-cli`datafusion_expr::tree_node::transform_box::h4a3722c597863a90 + 70
    frame #6: 0x0000555557159acc datafusion-cli`datafusion_expr::tree_node::_$LT$impl$u20$datafusion_common..tree_node..TreeNode$u20$for$u20$datafusion_expr..expr..Expr$GT$::map_children::hc2fc8c2c6eec78b5 + 876
    frame #7: 0x00005555572f4b46 datafusion-cli`datafusion_expr::tree_node::transform_box::h4a3722c597863a90 + 70
    frame #8: 0x0000555557159acc datafusion-cli`datafusion_expr::tree_node::_$LT$impl$u20$datafusion_common..tree_node..TreeNode$u20$for$u20$datafusion_expr..expr..Expr$GT$::map_children::hc2fc8c2c6eec78b5 + 876
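
The alternating map_children / transform_box frames are what a recursive tree traversal looks like on a very deep tree. As an analogy only (Python, not DataFusion's Rust, with hypothetical helper names), the sketch below contrasts a recursive traversal, which uses call frames for every nesting level, with the same traversal driven by a heap-allocated worklist. Note one difference: Python raises a catchable RecursionError, whereas a native stack overflow aborts the process as shown above.

```python
def or_chain(n):
    """Left-deep tree of n leaves joined by 'OR' nodes."""
    tree = ("leaf", 0)
    for i in range(1, n):
        tree = ("OR", tree, ("leaf", i))
    return tree


def count_leaves_recursive(node):
    """Recursion depth equals tree depth, so deep trees exhaust the stack."""
    if node[0] == "leaf":
        return 1
    return count_leaves_recursive(node[1]) + count_leaves_recursive(node[2])


def count_leaves_iterative(node):
    """Same traversal with an explicit worklist instead of call frames."""
    total = 0
    stack = [node]
    while stack:
        current = stack.pop()
        if current[0] == "leaf":
            total += 1
        else:
            stack.append(current[1])
            stack.append(current[2])
    return total


tree = or_chain(10000)
try:
    count_leaves_recursive(tree)
except RecursionError:
    print("recursive traversal hit Python's recursion limit")
print(count_leaves_iterative(tree))
```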
@simonvandel added the bug label on Jun 24, 2024
@jayzhan211
Contributor

I think this is a duplicate of #9375.

Maybe I could take a look; it looks quite interesting.

@simonvandel
Contributor Author

Ah yeah, absolutely! Sorry for not finding that issue. I'll close this issue then.

@simonvandel closed this as not planned on Jun 25, 2024
@jayzhan211
Contributor

@simonvandel Do you have this kind of query in practice (a real-world workload), or did you just find out that this query fails?

@simonvandel
Contributor Author

simonvandel commented Jun 26, 2024

@simonvandel Do you have this kind of query in practice (a real-world workload), or did you just find out that this query fails?

@jayzhan211 The query is derived from a real-world workload where many filters are possible.
I have worked around this for now by writing a custom UDF that performs the filters. It would be cool if that wasn't needed, though.
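
The UDF itself is not shown in this thread. As a rough sketch of the general idea only, in plain Python rather than as a DataFusion UDF, many equality filters can be collapsed into a single membership test, so the filter expression becomes one call instead of thousands of nested ORs. `make_membership_filter` is a hypothetical name.

```python
def make_membership_filter(allowed_values):
    """Return a predicate equivalent to `x = v0 OR x = v1 OR ...`."""
    allowed = frozenset(allowed_values)

    def matches(value):
        return value in allowed

    return matches


# One predicate stands in for 10000 chained equality filters.
matches = make_membership_filter(range(10000))
rows = [1, 10000, 42, -5]
print([row for row in rows if matches(row)])
```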
