-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Column renaming for the duration of a function call #4718
Comments
I think that a functional language would solve it like this: let take_smallest = func n key_getter tbl -> (
from tbl
sort (key_getter this)
take n
)
from tracks
group {album_id} (
take_smallest 3 (func this -> this.milliseconds)
) This currently fails miserably as functions are not really first-class citizens of PRQL. |
What about just passing the column as an argument to the function? This seems to work fine for me...
Compiled SQL: WITH table_0 AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY album_id
ORDER BY
bytes
) AS _expr_0
FROM
tracks
)
SELECT
*
FROM
table_0
WHERE
_expr_0 <= 3
-- Generated by PRQL compiler version:0.12.2 (https://prql-lang.org) Interestingly, if you omit the |
Thanks @kgutwin. I didn't know that could work. What does the Interestingly, I found that the My minimal working example is: let take_smallest = func n c tbl<relation> -> (
from tbl
sort c
take n
)
from tracks
group {album_id} (
take_smallest 3 bytes
) 🚀 |
This is very cool! There seem to be quite a few corner cases though. First a fun little example to play around with: let f = func n c tbl<relation> -> (from tbl | sort c | take n | select c)
let t = (from [{a=1},{a=2},{a=3}] | derive b=-a)
from t
f 1 a # works produces [
{"a": 1}
] We can also rename the column, including to let f = func n c tbl<relation> -> (from tbl | sort c | take n | select {c=c})
from t
f 1 b producing [
{"c": -3},
{"c": -2}
] So far, so good. However let f = func n c tbl<relation> -> (from tbl | sort c | take n | select {c, c=c})
from t
f 1 b produces [
{"c": -3, "c": -3}
] and SQL WITH table_0 AS (
SELECT
1 AS a
UNION
ALL
SELECT
2 AS a
UNION
ALL
SELECT
3 AS a
),
t AS (
SELECT
a,
- a AS b
FROM
table_0
)
SELECT
b AS c,
b AS c
FROM
t
ORDER BY
c
LIMIT
1
-- Generated by PRQL compiler version:0.12.2 (https://prql-lang.org) I think the expected output would be: [
{"b": -3, "c": -3}
] and SQL ...
SELECT
b,
b AS c
FROM
t
ORDER BY
c
LIMIT
1 |
I tried to see how far I can push this, for example with joins. Given the following simple setup, would it be possible to parameterise the relation and column being joined? let t = (from [{a=1},{a=2},{a=3}] | derive b=-a)
let u = (from t | select {a=a+1, b=b+1})
from t
join u (==b) Unfortunately none of the following worked for me: Simple: let j = func s<relation> c r<relation> -> (from r | join s (==c))
from t
j u a One level of indirection: let j = func s<relation> c r<relation> -> (
from r
derive {_i=c}
join (from s | derive {_i=c}) (==_i)
)
from t
j u a Explicit: let j = func s<relation> c r<relation> -> (
from r
derive {_i=c}
join (from s | derive {_i=c}) (this._i==that._i)
)
from t
j u a They all produce the following error (which is why I tried that last example):
|
Here's an interesting behavior (I was experimenting a bit with some of your examples): from t
select {b, d=b} produces: SELECT
b,
b AS d
FROM
t But: let g = func c tbl <relation> -> (
from tbl
select {c, d=c}
)
from t
g b produces: SELECT
b AS d,
b AS d
FROM
t That seems like it would be a bug. I would expect the second example to produce the same output as the first. |
Oh god, what have we created! The "passing column into the function" is not something that was designed purposefully - it's an artifact of how resolver is implemented, so it might change in the future. And also, it is not even fully correct as you two have found on many examples :D |
I'd argue in favor of keeping this sort of behavior (well, fixing the bugs and inconsistencies, of course). It feels really straightforward to pass a column reference to a user-created function, just like you pass column references to standard library functions. |
That's the thing: we don't have a clear idea what are references in PRQL; std functions are taking arguments by value (at least in my understanding). We'd need to formalize this notion of column references (that don't have a type), but are just names that the function is supposed to use. |
@snth has recently raised an issue that I remember we have already talked about, but cannot find now. I do think is a pain point and something we should do do better.
The text was updated successfully, but these errors were encountered: