Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement lenient parser #2129

Merged
merged 12 commits into from
Aug 8, 2023
Merged

implement lenient parser #2129

merged 12 commits into from
Aug 8, 2023

Conversation

trinity-1686a
Copy link
Contributor

fix #5

Most of the grammar part is done, however it's not reporting hints just yet (this isn't really required, but would be a nice addition, to help someone understand how their query was understood, and why the parser think it's somewhat wrong).

At the moment, only the part in query-grammar is done. More work is required to convert the AST to a query without failing. This part hasn't been explored yet, but should be before the ticket is considered fixed.

I'm sending this as a draft to get initial comments on the parser part

@codecov-commenter
Copy link

codecov-commenter commented Jul 19, 2023

Codecov Report

Merging #2129 (f7ed0b7) into main (ebc7812) will decrease coverage by 0.03%.
Report is 32 commits behind head on main.
The diff coverage is 93.66%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##             main    #2129      +/-   ##
==========================================
- Coverage   94.38%   94.35%   -0.03%     
==========================================
  Files         321      320       -1     
  Lines       60583    61527     +944     
==========================================
+ Hits        57179    58053     +874     
- Misses       3404     3474      +70     
Files Changed Coverage Δ
src/schema/json_object_options.rs 92.66% <ø> (ø)
src/schema/mod.rs 100.00% <ø> (ø)
query-grammar/src/lib.rs 57.14% <25.00%> (-42.86%) ⬇️
src/query/query_parser/query_parser.rs 92.82% <66.94%> (-2.61%) ⬇️
query-grammar/src/infallible.rs 92.99% <92.99%> (ø)
query-grammar/src/user_input_ast.rs 98.42% <94.73%> (+1.12%) ⬆️
query-grammar/src/query_grammar.rs 97.81% <97.33%> (-1.92%) ⬇️

... and 51 files with indirect coverage changes

@adamreichold
Copy link
Collaborator

The main question that came to mind reading this was whether the lenient parser should fully replace the existing parser. I think as an API between systems, the existing - let's say "strict" - parser is still useful. Of course, as a user interface, a lenient parser is very much preferable.

How do people think about having both parsers and choosing between them at compile time or via a runtime flag? I understand that maintaining two parsers is a significant commitment but I also feel that replacing the existing behaviour in a "flag day" type of change could trigger a certain amount of user backlash.

@trinity-1686a
Copy link
Contributor Author

the goal isn't to remove the strict parser. It was rewritten so both could share some code, but it will definitely be kept for the foreseeable future.

@adamreichold
Copy link
Collaborator

the goal isn't to remove the strict parser. It was rewritten so both could share some code, but it will definitely be kept for the foreseeable future.

Ah sorry, then I misunderstood the PR in my first reading and the concern does not apply at all. Thanks for clarifying!

Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice approach.

But you really should put more effort in making your code readable.
Here are a bunch of concrete things you can do to improve this.

Have proper variable names

This is short and well contained, so i and o might be ok variable names.
I would have preferred input and output but I guess this can be argued.
i1, i2 however, is really not helpful.

Stop abusing iterator style.

For instance, if you do it to mutate stuff, that's probably a bad idea.
If you start using rare iterator functions, that's probably a bad idea too.
If you end up struggling with dealing with error handling, using more than one trick like transpose and collect::<Result<T,_>>, it is probably a bad idea.
If your code seems to suggest a variable is a vec but it is in fact an Option or a Result, this is a bad idea.

Consider using struct instead of tuple

type Error = (usize, String);
That kind of tuple makes code easy to write, but hard to understand.
You actually changed the semantics of the first argument, which is making things even worse.

If possible, I would have preferred two structs.
This however comes with a verbosity cost, so this may not be pertinent: I leave you judge of whether the change should be done or not.

Comment more

Something like
type Error = (usize, String);
is really hard for the reader.

Show types in your code when it is not easy to guess.

We sometimes have to read rust without type hints.
For instance in github or in an editor for which rust analyzer has not finished crunching a
freshly checked out branch.
It is useful to add explicit types here in strategic places.

@trinity-1686a trinity-1686a marked this pull request as ready for review August 7, 2023 15:09
@trinity-1686a trinity-1686a merged commit b92082b into main Aug 8, 2023
5 checks passed
@trinity-1686a trinity-1686a deleted the trinity--lenient-parser branch August 8, 2023 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a lenient mode to QueryParser
4 participants