Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRILL-8393: Allow parameters to be passed to headers through SQL in WHERE clause #2747

Merged
merged 21 commits into from
Aug 3, 2023

Conversation

LYCJeff
Copy link
Contributor

@LYCJeff LYCJeff commented Feb 3, 2023

DRILL-8393: Allow parameters to be passed to headers through SQL in WHERE clause

Description

Allow parameters to be passed to headers through SQL in WHERE clause. Use the params configuration item to control what parameters are allowed in, passing them into the body and header depending on the prefix.

Config:
{ "url": "https://api.sunrise-sunset.org/json", "requireTail": false, "params": ["body.lat", "body.lng", "body.date", "header.header1"], "parameterLocation": "json_body" }

SQL Query:

SELECT * FROM api.sunrise
WHERE `body.lat` = 36.7201600
AND `body.lng` = -4.4203400
AND `body.date` = '2019-10-02'
AND `header.header1` = 'value1';

Post body:
{ "lat": 36.7201600, "lng": -4.4203400, "date": "2019-10-02"}

Headers:
{ "header1": "value1", ……}

Documentation

The params configuration mode and SQL parameter passing mode need to be described in the document.

Testing

The params configuration and SQL in the original unit test have been changed and have passed this part of the unit test.

in WHERE clause. Use the params configuration item to
control what parameters are allowed in, passing them
into the body and header depending on the prefix.
@cgivre cgivre added enhancement PRs that add a new functionality to Drill doc-impacting PRs that affect the documentation labels Feb 7, 2023
@cgivre cgivre added the breaking Includes changes that can cause an upgrade from the stable release to break label Feb 13, 2023
@LYCJeff
Copy link
Contributor Author

LYCJeff commented Feb 13, 2023

@cgivre This is basically all I have to change, I'm not sure if this change is appropriate and what else I have to do. I'd love to hear your opinion.

@LYCJeff LYCJeff marked this pull request as draft February 14, 2023 00:10
@LYCJeff LYCJeff marked this pull request as ready for review February 14, 2023 00:12
@LYCJeff LYCJeff marked this pull request as draft February 15, 2023 14:59
@LYCJeff LYCJeff marked this pull request as ready for review February 15, 2023 15:00
@cgivre
Copy link
Contributor

cgivre commented Feb 28, 2023

@LYCJeff
I really like the functionality here, but I am concerned that this is a breaking change and will affect existing Drill users. Also, it adds effectively new syntax to the SQL queries.

@LYCJeff
Copy link
Contributor Author

LYCJeff commented Mar 1, 2023

@LYCJeff I really like the functionality here, but I am concerned that this is a breaking change and will affect existing Drill users. Also, it adds effectively new syntax to the SQL queries.

@cgivre At this point, I can pass the unprefixed parameters in their place by default, the way they were. This minimizes the impact on existing users, except in the following cases. For example, the argument that the user passed into the request body was called header.xxx, but now needs to be rewritten as body.header.xxx, otherwise the argument will be passed into the request header.

In addition, a problem that had been fixed would reappear. The argument that is passed to the url path is also passed to the end of the url, which has been clearly distinguished since I changed it.

Let me know if you think this is more friendly to existing users, then I'll move in this direction.

@jnturton
Copy link
Contributor

jnturton commented Mar 2, 2023

Two ideas

  1. Since we won't backport this PR and it will only go out in the next major release, some breakage inside a plugin is probably something that can be swallowed.
  2. If it is still desired to preserve the ability to use the existing syntax in Drill 1.22 and beyond then a storage config option like "useLegacyRequestParmSyntax": true could be added for users who want it.

@LYCJeff
Copy link
Contributor Author

LYCJeff commented Mar 2, 2023

Two ideas

  1. Since we won't backport this PR and it will only go out in the next major release, some breakage inside a plugin is probably something that can be swallowed.
  2. If it is still desired to preserve the ability to use the existing syntax in Drill 1.22 and beyond then a storage config option like "useLegacyRequestParmSyntax": true could be added for users who want it.

@jnturton @cgivre That's a good idea without confusing old and new syntax, although it requires existing users to make small additions to the configuration. If it is acceptable to you, I will take some time to add a configuration item in the near future.

@LYCJeff LYCJeff marked this pull request as draft March 8, 2023 01:08
@LYCJeff LYCJeff marked this pull request as ready for review March 8, 2023 01:08
@cgivre
Copy link
Contributor

cgivre commented Mar 16, 2023

@LYCJeff Thanks for making these changes. I have a few questions:

  1. Are you certain that these filters are in fact being pushed down as intended?
  2. I'm really concerned about what would happen if a user aliased a data source as header or tail.

IE:

SELECT ... 
FROM api.foo 
INNER JOIN dfs.`tail.csv` AS tail
ON tail.id = foo.id
WHERE tail.name = 'something'

Do we know how this would be interpreted?

@LYCJeff
Copy link
Contributor Author

LYCJeff commented Mar 16, 2023

@LYCJeff Thanks for making these changes. I have a few questions:

  1. Are you certain that these filters are in fact being pushed down as intended?
  2. I'm really concerned about what would happen if a user aliased a data source as header or tail.

IE:

SELECT ... 
FROM api.foo 
INNER JOIN dfs.`tail.csv` AS tail
ON tail.id = foo.id
WHERE tail.name = 'something'

Do we know how this would be interpreted?

@cgivre Well, we actually need to recognize tail.xxx as a whole parameter name, so we need to use back quotes. Only then can it be pushed normally, so these prefixes are not confused with data source aliases.

If the name in your example above is an argument to the foo api, it should be written as follows.

SELECT ...
FROM api.foo
INNER JOIN dfs.`tail.csv` AS tail
ON tail.id=foo.id
WHERE `tail.name` = 'something'

@cgivre
Copy link
Contributor

cgivre commented Apr 18, 2023

@LYCJeff
Thank you for submitting this and I'm sorry the review is taking so long. This is a potentially very breaking change and that's why I've had to give this so much thought. I've also been quite busy. Ok...

Let's get this merged.
Here are my final requests:

  1. Please rebase on current master
  2. Can we make the default behavior what Drill currently does and change the boolean parameter to something like enablePostHeadersInWhere or something that uses the word enable?
  3. Make the default behavior to use the current implementation.

Once that's done, I'm good with merging.

contrib/storage-http/README.md Outdated Show resolved Hide resolved
contrib/storage-http/README.md Show resolved Hide resolved
LYCJeff and others added 2 commits April 19, 2023 09:03
…nge the boolean to enableEnhancedParamSyntax.

Make the default behavior what Drill currently does.
Modify the document.
@LYCJeff LYCJeff requested a review from cgivre April 24, 2023 08:01
Copy link
Contributor

@cgivre cgivre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, This is looking good. Can you please add a few unit tests, or simply revert a few where the enhanced mode is set to false? Once that's done we're good to go.

@LYCJeff
Copy link
Contributor Author

LYCJeff commented May 17, 2023

Ok, This is looking good. Can you please add a few unit tests, or simply revert a few where the enhanced mode is set to false? Once that's done we're good to go.

I got a COVID. I'll fix it after I recover.

@LYCJeff LYCJeff requested a review from cgivre June 30, 2023 03:37
@LYCJeff
Copy link
Contributor Author

LYCJeff commented Aug 1, 2023

@cgivre Is there anything else I should do for this PR?

@cgivre
Copy link
Contributor

cgivre commented Aug 1, 2023

@LYCJeff Could you please rebase on current master. Other than that, LGTM +1.

Sorry this took so long.
Thank you very much for this contribution!

Copy link
Contributor

@cgivre cgivre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1

@LYCJeff
Copy link
Contributor Author

LYCJeff commented Aug 2, 2023

@LYCJeff Could you please rebase on current master. Other than that, LGTM +1.

Sorry this took so long. Thank you very much for this contribution!

@cgivre I have updated, thanks for your review.

@cgivre cgivre merged commit 4e3992b into apache:master Aug 3, 2023
13 of 15 checks passed
cgivre pushed a commit to cgivre/drill that referenced this pull request Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Includes changes that can cause an upgrade from the stable release to break doc-impacting PRs that affect the documentation enhancement PRs that add a new functionality to Drill
Projects
None yet
3 participants