Add `#[recursive]` #1522

blaginin · 2024-11-14T20:49:21Z

Closes #984, related to apache/datafusion#9375 (comment)

Todo:

Test the performance overhead on large queries. If it's too high, move it to a separate feature to allow opting out

blaginin · 2024-11-14T21:08:31Z

I was wondering whether we should remove RecursionCounter in this PR. In my opinion we shouldn't, because the ability to limit the maximum recursion depth might still be useful for some users.

blaginin · 2024-11-15T19:08:24Z

It seems this PR doesn't have any significant performance impact.

critcmp main recursive                        
group                                                    main                                   recursive
-----                                                    ----                                   ---------
sqlparser-rs parsing benchmark/format_large_statement    1.00   309.1±12.72µs        ? ?/sec    1.01    312.4±7.45µs        ? ?/sec
sqlparser-rs parsing benchmark/parse_large_statement     1.02      6.3±0.35ms        ? ?/sec    1.00      6.2±0.40ms        ? ?/sec
sqlparser-rs parsing benchmark/parse_sql_large_query     1.00      6.1±0.23ms        ? ?/sec  
sqlparser-rs parsing benchmark/sqlparser::select         1.02  1893.8±82.19ns        ? ?/sec    1.00  1855.1±25.08ns        ? ?/sec
sqlparser-rs parsing benchmark/sqlparser::with_select    1.03     11.7±1.20µs        ? ?/sec    1.00     11.4±0.28µs        ? ?/sec

blaginin · 2024-11-15T19:28:30Z

FYI, I marked this ready for review. @peter-toth @Eason0729, if you guys want to take a look 🌻

iffyio · 2024-11-16T06:35:00Z

@blaginin thanks for looking to fix this!

Currently the preference is to avoid a third-party dependency for this issue, ideally fixing up the parser behavior instead to properly handle deeply nested input. See comment here for a bit more context on rationale.
From a quick look at recursive, it seems to be a proc macro on top of the stacker library, so I imagine the same considerations should apply.

Eason0729

Thanks @blaginin. LGTM except for some of the numbers in the unit tests.

I left some comments; any number above 40000 seems to overflow the stack without #[recursive].
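
For anyone who wants to reproduce inputs of that size, a helper along these lines could generate the nested SQL. This is only a minimal sketch: `nested_query` is a hypothetical name, and the tests in this PR may build their input differently.

```rust
/// Build a query whose single projected expression is wrapped in `depth`
/// pairs of parentheses, e.g. depth 3 gives `SELECT (((1)))`.
/// Hypothetical helper for illustration; not taken from the PR's test suite.
fn nested_query(depth: usize) -> String {
    let mut sql = String::with_capacity("SELECT ".len() + 1 + 2 * depth);
    sql.push_str("SELECT ");
    sql.push_str(&"(".repeat(depth));
    sql.push('1');
    sql.push_str(&")".repeat(depth));
    sql
}
```

Calling `nested_query(40_000)` yields a string with 40000 levels of nesting, which is the range where the overflow described above was observed.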

Eason0729 · 2024-11-16T08:02:16Z

sqlparser_bench/benches/sqlparser_bench.rs

@@ -42,6 +42,46 @@ fn basic_queries(c: &mut Criterion) {
    group.bench_function("sqlparser::with_select", |b| {
        b.iter(|| Parser::parse_sql(&dialect, with_query));
    });
+


For large_statement, making it a separate test would make it easier to tell potential (future) regressions apart.
It's only a suggestion (fine to leave it as it is).

blaginin · 2024-11-16

Not sure I understood your comment, sorry. I added tests for parsing large statements in tests/sqlparser_common.rs. Do you think we should test something else?

Eason0729 · 2024-11-17

I think it would be better to add it as a separate test.

tests/sqlparser_common.rs

src/ast/visitor.rs

peter-toth · 2024-11-16T09:17:57Z

> Currently the preference is to avoid a third-party dependency for this issue, ideally fixing up the parser behavior instead to properly handle deeply nested input. See comment here for a bit more context on rationale. From a quick look at recursive it seems to be a procmacro on top of the stacker library so that the same considerations should apply I imagine.

@iffyio, we had done some research on stacker / recursive in apache/datafusion#13310 to verify concerns before we added it to datafusion: apache/datafusion#13310 (review) / apache/datafusion#13177 (comment)

Eason0729 · 2024-11-16T10:46:45Z

> > Currently the preference is to avoid a third-party dependency for this issue, ideally fixing up the parser behavior instead to properly handle deeply nested input. See comment here for a bit more context on rationale. From a quick look at recursive it seems to be a procmacro on top of the stacker library so that the same considerations should apply I imagine.
>
> @iffyio, we had done some research on stacker / recursive in apache/datafusion#13310 to verify concerns before we added it to datafusion: apache/datafusion#13310 (review) / apache/datafusion#13177 (comment)

Sorry for missing that context when reviewing. So if we would like to avoid using recursive for stability, maybe try one of these:

vendor the code from recursive
use stacker directly, as in #1468 (fix: add stacker and maybe_grow on recursion guard)

blaginin · 2024-11-16T13:18:43Z

Thanks for the review!! 🙂

> Sorry for missing that context when reviewing. So we would like to avoid using recursive for stability, maybe try...

We ended up using #[recursive] in Datafusion, as Peter highlighted. Feels like it’s good to stay consistent across projects?

iffyio · 2024-11-17T05:58:24Z

Ah I see, thanks for the context @peter-toth!

cc @alamb for overall thoughts on adding this dependency to sqlparser?

alamb · 2024-11-24T12:05:41Z

> Ah I see, thanks for the context @peter-toth!
>
> cc @alamb for overall thoughts on adding this dependency to sqlparser?

While adding new dependencies in general is 🤮, I don't think there is any viable alternative in this case.

We have tried to avoid doing something like stacker for several years with sqlparser and datafusion, but I feel like we have only been able to band aid over the problem, not fix the problem once and for all.

I am hopeful that if we adopt this particular crate we won't have to worry about it again 🤞

alamb · 2024-11-24T12:07:48Z

I think this PR needs a bit more documentation, and we should figure out how to rationalize it with the existing recursion_limit argument.

https://docs.rs/sqlparser/latest/sqlparser/parser/struct.Parser.html#method.with_recursion_limit

blaginin · 2024-11-26T19:19:24Z

@alamb, I feel like recursion_limit should stay because:

recursive doesn't solve the issue when std isn’t enabled.
Removing it would be a breaking change.
It can be a useful feature even with recursive protection, for example, if lib users have additional constraints on the parsed query.

I added notes in the methods I touched; should be better now ✋
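
To illustrate the third point, here is a minimal sketch of how a depth budget can reject overly nested input without relying on stack growth at all. Everything in it (`ToyParser`, `ParseError`, `parse`) is hypothetical and written for this comment; it is not sqlparser's RecursionCounter or its `with_recursion_limit` implementation, only the same general idea on a toy grammar of nested parentheses around one digit.

```rust
/// Error type for the toy parser (hypothetical, not sqlparser's `ParserError`).
#[derive(Debug, PartialEq)]
enum ParseError {
    RecursionLimitExceeded,
    /// Unexpected byte or end of input at this offset.
    Unexpected(usize),
}

/// Toy recursive-descent parser for inputs such as `((1))`.
struct ToyParser<'a> {
    input: &'a [u8],
    pos: usize,
    remaining_depth: usize,
}

impl<'a> ToyParser<'a> {
    fn new(input: &'a str, recursion_limit: usize) -> Self {
        Self { input: input.as_bytes(), pos: 0, remaining_depth: recursion_limit }
    }

    /// Parse one expression and return how many parentheses wrap the digit.
    fn parse_expr(&mut self) -> Result<usize, ParseError> {
        // Spend one unit of the depth budget per nested call and give it
        // back on the way out, whether parsing succeeded or failed.
        if self.remaining_depth == 0 {
            return Err(ParseError::RecursionLimitExceeded);
        }
        self.remaining_depth -= 1;
        let result = self.parse_expr_inner();
        self.remaining_depth += 1;
        result
    }

    fn parse_expr_inner(&mut self) -> Result<usize, ParseError> {
        match self.input.get(self.pos) {
            Some(b'(') => {
                self.pos += 1;
                let depth = self.parse_expr()?;
                match self.input.get(self.pos) {
                    Some(b')') => {
                        self.pos += 1;
                        Ok(depth + 1)
                    }
                    _ => Err(ParseError::Unexpected(self.pos)),
                }
            }
            Some(c) if c.is_ascii_digit() => {
                self.pos += 1;
                Ok(0)
            }
            _ => Err(ParseError::Unexpected(self.pos)),
        }
    }
}

/// Convenience wrapper: parse `input` with the given depth budget.
fn parse(input: &str, recursion_limit: usize) -> Result<usize, ParseError> {
    ToyParser::new(input, recursion_limit).parse_expr()
}
```

With a budget of 2, `parse("((1))", 2)` returns `Err(ParseError::RecursionLimitExceeded)` instead of recursing further, while a budget of 3 is enough for the same input. A limit like this works without std and lets library users cap query complexity, which is why it seems worth keeping alongside the stack-growth protection.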

Eason0729

LGTM

Eason0729 · 2024-11-28T17:02:26Z

It seems we have reached the decision to add recursive instead of using the underlying dependency (stacker) directly.

alamb

I think it looks good -- any other thoughts, @iffyio?

Thank you @blaginin and @Eason0729 for pushing this along

blaginin · 2024-12-02T18:23:28Z

Just for transparency, there's apache/datafusion#13513 raised in Datafusion but I believe it shouldn't be the reason not to merge this one (happy to be challenged)

alamb · 2024-12-02T18:30:33Z

> Just for transparency, there's apache/datafusion#13513 raised in Datafusion but I believe it shouldn't be the reason not to merge this one (happy to be challenged)

Maybe we could make it an optional dependency 🤔

blaginin · 2024-12-02T18:32:02Z

nice idea actually! will do

iffyio

LGTM!

blaginin · 2024-12-06T20:11:37Z

resolved conflicts, should be good to merge now 🤗

alamb · 2024-12-11T22:36:04Z

Given the potential for unintended side effects with this change, I think we should merge it as soon as possible after we have released 0.53.0 (Release sqlparser-rs version 0.53.0 / sqlparser_derive 0.3.0 #1517).

Eason0729 · 2024-12-17T15:45:57Z

Should we merge this, or is something still missing from this PR?
I have no write access, so cc @alamb .

blaginin · 2024-12-17T20:57:11Z

hey, based on the previous comment I think we want to make a release first 🙂

Eason0729 · 2024-12-18T01:46:58Z

> hey, based on the previous comment I think we want to make a release first 🙂

Thanks.
I am sorry for missing that! 👀

alamb · 2024-12-19T19:17:12Z

Sorry -- the reason I hadn't merged this earlier is exactly that I wanted to merge it after the release, to give it enough "bake time". I am glad we did wait, actually, as using this macro has caused trouble downstream in DataFusion; see apache/datafusion#13766 (Making the recursive dependency an optional feature).

Now that we have released 0.53.0 (Release sqlparser-rs version 0.53.0 / sqlparser_derive 0.3.0 #1517, finally made it yesterday), I think we are good to go.

Thank you again @blaginin and @Eason0729 for your contributions and patience

lovasoa · 2025-02-18T09:48:24Z

Hi @alamb , @iffyio ! Just wanted to report that I just received a crash report that seems to come from here: sqlpage/SQLPage#814

alamb · 2025-02-18T19:50:01Z

> Hi @alamb , @iffyio ! Just wanted to report that I just received a crash report that seems to come from here: sqlpage/SQLPage#814

Thanks @lovasoa

Looks like you have also reported a bug to stacker:

Assertion error: when pthread returns an error, this library crashes the entire program rust-lang/stacker#115

I am glad there is a way to disable the stacker dependency in sqlparser

Is there anything else you think we should do here?

Thanks again

lovasoa · 2025-02-18T20:29:35Z

Yes, I discovered error handling is completely absent from stacker, and it just crashes the entire program with a cryptic error message when the underlying pthread library returns any error. I submitted a PR, but in the meantime, I wouldn't recommend including it by default in sqlparser, especially if the goal was to avoid crashes in the first place.

lovasoa · 2025-02-18T20:31:04Z

SQLPage does not have a huge install base, and it took just a few days before the first crash report.

alamb · 2025-02-19T11:04:02Z

> SQLPage does not have a huge install base, and it took just a few days before the first crash report.

DataFusion does have a pretty large user base and we haven't gotten crash reports yet that I know of. I was somewhat worried about using stacker in datafusion too until @peter-toth pointed out that it was used by rustc itself which allayed my concerns.

It does seem like the use case

> glibc targets where procfs isn't mounted at /proc

is somewhat uncommon.

I don't really have a strong opinion one way or the other.

lovasoa · 2025-02-19T11:20:45Z

Yes, I initially thought that the problem was specific to their very restricted environment, but looking at the code in stacker, they crash the entire program on ANY error returned by any of the pthread functions used. And pthread functions can return an error code in a number of cases.

alamb · 2025-02-19T11:43:45Z

Interesting -- I haven't looked at the code or what functions are used (and thus under what circumstances such errors happen or how likely they are to occur)

peter-toth · 2025-02-19T12:17:13Z

I didn't notice the missing error handling either. Your rust-lang/stacker#116 seems like a nice improvement. But if it doesn't get accepted for some reason, then probably we could handle errors in DF. Maybe adjust the recursive macro to run some tests before using stacker...

alamb · 2025-02-23T11:46:48Z

> I didn't notice the missing error handling either. Your rust-lang/stacker#116 seems like a nice improvement. But if it doesn't get accepted for some reason, then probably we could handle errors in DF. Maybe adjust the recursive macro to run some tests before using stacker...

Looks like rust-lang/stacker#116 (improve error handling) was accepted 🎉

blaginin mentioned this pull request Nov 14, 2024

Detect stack overflow and reduce stack usage on debug build #1465

Closed

blaginin force-pushed the add-recursive branch 2 times, most recently from 458f748 to a4a5794 Compare November 14, 2024 20:59

Add #[recursive]

73891ee

blaginin force-pushed the add-recursive branch from a4a5794 to 73891ee Compare November 14, 2024 21:02

Add larger benchmarks

39f710d

blaginin added 2 commits November 15, 2024 19:10

Rename

90843ad

Cargo fmt

3af997b

blaginin marked this pull request as ready for review November 15, 2024 19:22

Eason0729 reviewed Nov 16, 2024

blaginin added 4 commits November 26, 2024 19:11

Merge branch 'main' into add-recursive

1104f25

# Conflicts: # tests/sqlparser_common.rs

Cargo fmt

82b26f6

Add notes

c9eef87

Add a note on with_recursion_limit

913a291

Eason0729 approved these changes Nov 28, 2024

alamb approved these changes Dec 2, 2024

blaginin added 2 commits December 2, 2024 18:32

Merge branch 'main' into add-recursive

78effdf

Move to a separate feature

34acb8b

iffyio approved these changes Dec 3, 2024

blaginin and others added 3 commits December 3, 2024 21:45

Update README.md

5e3838b

Co-authored-by: Ifeanyi Ubah

Merge branch 'main' into add-recursive

505c271

# Conflicts: # tests/sqlparser_common.rs

Merge remote-tracking branch 'origin/add-recursive' into add-recursive

b91baec

alamb mentioned this pull request Dec 14, 2024

Making the recursive dependency an optional feature apache/datafusion#13766

Closed

Merge branch 'main' into add-recursive

41c2424

# Conflicts: # tests/sqlparser_common.rs

alamb merged commit 84e82e6 into apache:main Dec 19, 2024
8 checks passed

alamb mentioned this pull request Jan 16, 2025

Add recursion limit configuration to DFParser apache/datafusion#14095

Closed

This was referenced Feb 18, 2025

Remove dependency to stacker, which crashes sqlpage sqlpage/SQLPage#815

Merged

Assertion error: when pthread returns an error, this library crashes the entire program rust-lang/stacker#115

Closed

chenkovsky mentioned this pull request Mar 8, 2025

fix: nested window function apache/datafusion#15033

Merged
