Skip to content

Group by incorrectly works with alias. #2430

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
comphead opened this issue May 3, 2022 · 6 comments · Fixed by #2457
Closed

Group by incorrectly works with alias. #2430

comphead opened this issue May 3, 2022 · 6 comments · Fixed by #2457
Assignees
Labels
bug Something isn't working

Comments

@comphead
Copy link
Contributor

comphead commented May 3, 2022

Describe the bug
Group by incorrectly works with alias. The query fails with "Projection references non-aggregate values"
`

To Reproduce

async fn csv_query_group_by_substr() -> Result<()> {
    let ctx = SessionContext::new();
    register_aggregate_csv(&ctx).await?;
    let sql = "SELECT substr(c1, 1, 1) c1 \
        FROM aggregate_test_100 \
        GROUP BY substr(c1, 1, 1) \
        ";
    let actual = execute_to_batches(&ctx, sql).await;
    let expected = vec![
        "+----+-----------------------------+",
        "| c1 | AVG(aggregate_test_100.c12) |",
        "+----+-----------------------------+",
        "| a  | 0.48754517466109415         |",
        "| b  | 0.41040709263815384         |",
        "| c  | 0.6600456536439784          |",
        "| d  | 0.48855379387549824         |",
        "| e  | 0.48600669271341534         |",
        "+----+-----------------------------+",
    ];
    assert_batches_sorted_eq!(expected, &actual);
    Ok(())
}

Expected behavior
Test should pass

Additional context
None

@comphead comphead added the bug Something isn't working label May 3, 2022
@comphead
Copy link
Contributor Author

comphead commented May 3, 2022

@andygrove can this be assigned to me?

@andygrove
Copy link
Member

@comphead I have assigned this to you. I have spent some time working on this and keep running into problems with the way we try and compute names and determine type information based on an expression and a schema. I will be interested to see what you find.

I am now going to start working on #1468 and see where that takes me. If the expressions carry their names with them then perhaps it helps with some of these issues? I am still learning and exploring ...

@comphead
Copy link
Contributor Author

comphead commented May 4, 2022

2 bugs here.

  • resolve_aliases_to_exprs(&group_by_expr, &alias_map)?; which does wrong resolving and breaks expr to substr(substr(aggregate_test_100.c1,Int64(1),Int64(1)),Int64(1),Int64(1)) instead of substr(aggregate_test_100.c1,Int64(1),Int64(1))
  • check_columns_satisfy_exprs that gets confused because of alias the same as col name in agg expression

@comphead
Copy link
Contributor Author

comphead commented May 4, 2022

@comphead I have assigned this to you. I have spent some time working on this and keep running into problems with the way we try and compute names and determine type information based on an expression and a schema. I will be interested to see what you find.

I am now going to start working on #1468 and see where that takes me. If the expressions carry their names with them then perhaps it helps with some of these issues? I am still learning and exploring ...

@andygrove I'm not sure this will solve all the problems. Currently the test like select count(1), count(1) from t will fail because the columns aliases are the same. Currently the name gets generated based on colname + function + types. To diff them you may want include some increment like column position, or reference to current query block hashcode, or other unique constituent

@andygrove
Copy link
Member

@comphead I have the test passing in the WIP PR #2457

I ran into the solution whilst working on ROLLUP support

@comphead
Copy link
Contributor Author

comphead commented May 6, 2022

Thank you, I was trying to solve it in planner but your proposed solution looks more robust! Looking forward to test it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants