Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table name and transform name collisions #3271

Open
2 tasks done
FinnRG opened this issue Aug 6, 2023 · 8 comments
Open
2 tasks done

Table name and transform name collisions #3271

FinnRG opened this issue Aug 6, 2023 · 8 comments
Labels
language-design Changes to PRQL-the-language needs-discussion Undecided dilemma

Comments

@FinnRG
Copy link
Contributor

FinnRG commented Aug 6, 2023

What happened?

I have a table named group, which I tried to query. I tried using backticks (as suggested in the docs: https://prql-lang.org/book/reference/stdlib/transforms/from.html), which doesn't seem to work.

PRQL input

from `group`
filter id == $1

SQL output

No output

Expected SQL output

No response

MVCE confirmation

  • Minimal example
  • New issue

Anything else?

Exact error:

   ╭─[:2:1]
   │
 2 │ filter id == $1
   │ ───────┬───────  
   │        ╰───────── main expected type `relation`, but found type `infer -> infer`
   │ 
   │ Help: Have you forgotten an argument to function main?
   │ 
   │ Note: Type `relation` expands to `[tuple_of_scalars]`
───╯
@FinnRG FinnRG added the bug Invalid compiler output or panic label Aug 6, 2023
@max-sixty
Copy link
Member

Thanks @FinnRG . Yes, repro-ed. And a particularly bad error message...

As a workaround in the meantime, this works:

from s"SELECT * from group"
WITH table_0 AS (
  SELECT
    *
  from
    group
)
SELECT
  *
FROM
  table_0

-- Generated by PRQL compiler version:0.9.3 (https://prql-lang.org)

@aljazerzen
Copy link
Member

This is expected behavior. An unfortunate one, but a compound of how we do name resolution and table inference.

So the problem is that identifier group matches std.group (the group function) instead of default_db.group (the table declaration).

The solution is this:

from default_db.group
filter id == $1

... except that now, this will compile to SELECT * FROM default_db.group ..., because this will infer that we are referring to default_db.default_db.group.

The real solution is:

default_db.group
filter id == $1

... which bypasses from function and its behavior of defaulting to default_db module.

@max-sixty
Copy link
Member

Completely abstract of the current implementation, I do think that's a bit unfriendly / inconsistent.

I recognize we do need to make some tradeoffs because of PRQL's use of "bare words" — there'll always be some ambiguity. But (thinking on the fly here, not super confident):

  • there would be a robust way which machines could write — e.g. in column selection, we can always have this.foo, then this.time will resolve to a column even when time resolves to a type.
  • the robust way would follow from the less robust way. My objection to the current implementation is that if a machine wanted to generate PRQL, it would generate default_db.foo rather than from foo, which then doesn't look much like PRQL.

How about we extend the this paradigm to table selection too? So from this.group gets the group table?

We'd need to discriminate between scopes:

  • where there's an implicit relation — e.g. within a derive, where this refers to a relation
  • where the top scope is default_db — e.g. within a from, where this refers to the default db

(other languages have slightly complicated rules around this, e.g. jq and go templates use $ to jump scope levels IIRC)


See also: #2155, #1619

@FinnRG
Copy link
Contributor Author

FinnRG commented Aug 7, 2023

Could I open a PR to add the default_db.group solution to the documentation, or is this a non-standard/implementation workaround? Personally I think the this solution would be good too, especially since this was one of the solutions I tried after the backticks didn't work as expected.

@max-sixty
Copy link
Member

Thanks for the offer re a PR for the docs — we'd def accept it.

If we do that, would be great to link to this issue and caveat — we don't think the current state is great, but this is a workaround...

@aljazerzen
Copy link
Member

if a machine wanted to generate PRQL, it would generate default_db.foo rather than from foo, which then doesn't look much like PRQL.

That tells me that we should either:

  • change the fact that tables are regular declarations in modules and make them something special, so we can have special name resolution rules for from and join functions, or
  • remove the from function and embrace the pipeline:
    default_db.my_table
    select {a, b}
    
    (we could have something else in place of default_db, maybe tables?)

@max-sixty
Copy link
Member

WDYT about having something like this also operate on tables, like from this.group to refer to a table named group? In the same way that derive x = this.group refers to a column named group?

this could also be tables if that's simpler — otherwise it could be quite unclear what this refers to in a join.

But is there a need to discard the from there?


  • remove the from function and embrace the pipeline:
    default_db.my_table select {a, b}
    (we could have something else in place of default_db, maybe tables?)

This was the very first proposal for PRQL — no from. But folks didn't like it — I think because having each line start with the verb of what it's doing is nice & consistent...

@snth
Copy link
Member

snth commented Aug 15, 2023

Coming late to this discussion but will give it some thought as well.

@aljazerzen aljazerzen added language-design Changes to PRQL-the-language and removed bug Invalid compiler output or panic labels Nov 9, 2023
@aljazerzen aljazerzen added the needs-discussion Undecided dilemma label Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language-design Changes to PRQL-the-language needs-discussion Undecided dilemma
Projects
None yet
Development

No branches or pull requests

4 participants