Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windowing] Handle null values in partitions #141

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 8 additions & 1 deletion internal/function_window_option.go
Original file line number Diff line number Diff line change
Expand Up @@ -274,6 +274,9 @@ type WindowFuncStatus struct {
OrderBy []*WindowOrderBy
}

// windowNilPartitionValue Placeholder value for nil
const windowNilPartitionValue = StringValue("^^^ZETASQLITE_NIL^^^")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goccy While exploring other issues, it seems this case comes up often.

Does it make sense to introduce a new NilValue class instead?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NULL is not the first class's value. So I want to deal with this by handling null properly.


func (s *WindowFuncStatus) Partition() (string, error) {
partitions := make([]string, 0, len(s.Partitions))
for _, p := range s.Partitions {
Expand Down Expand Up @@ -314,7 +317,11 @@ func parseWindowOptions(args ...Value) ([]Value, *WindowFuncStatus, error) {
case WindowFuncOptionEnd:
opt.End = v.Value.(*WindowBoundary)
case WindowFuncOptionPartition:
opt.Partitions = append(opt.Partitions, v.Value.(Value))
if v.Value != nil {
opt.Partitions = append(opt.Partitions, v.Value.(Value))
} else {
opt.Partitions = append(opt.Partitions, windowNilPartitionValue)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just use nil instead of windowNilPartitionValue ?

Copy link
Contributor Author

@ohaibbq ohaibbq Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would strings.Join("_", []string{"a", nil, "c"}) evaluate to?

The intention is that windowed rows that share common nil columns are partitioned together.

WITH toks as (
  SELECT 'a' as prop1, null as prop2, 'aa' as prop3
  UNION ALL SELECT 'a' as prop1, 'aa' as prop2, null as prop3
)
SELECT prop1, ROW_NUMBER() over (PARTITION BY prop2, prop3) AS rn FROM toks

Expected:

prop1 rn
a 1
a 1

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A guard expression against nil must be inserted before the join process.

func (s *WindowFuncStatus) Partition() (string, error) {
	partitions := make([]string, 0, len(s.Partitions))
	for _, p := range s.Partitions {
                if p == nil {
                  continue
                }
		text, err := p.ToString()
		if err != nil {
			return "", err
		}
		partitions = append(partitions, text)
	}
	return strings.Join(partitions, "_"), nil
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the query above, wouldn't that result in the following?

prop1 rn
a 1
a 2

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Then it looks like we should add constant NullValue structure.

Example.

  • value.go
type NullValue struct {
  zeroValue Value
}

NullValue should have the original type information, so it should either have a zero value or reflect.Type .
If NULL value is integer type, initialize as follows

&NullValue{zeroValue: IntValue(0)}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to do this as I think it is the right approach, too.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you !! Generics seems to be a better choice. 😃

type NullValue[T Value] struct {
  zeroValue T
}

}
case WindowFuncOptionRowID:
opt.RowID = v.Value.(int64)
case WindowFuncOptionOrderBy:
Expand Down
4 changes: 3 additions & 1 deletion query_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1592,14 +1592,16 @@ WITH finishers AS
UNION ALL SELECT 'Jen Edwards', TIMESTAMP '2016-10-18 3:06:36+00', 'F30-34'
UNION ALL SELECT 'Meghan Lederer', TIMESTAMP '2016-10-18 3:07:41+00', 'F30-34'
UNION ALL SELECT 'Carly Forte', TIMESTAMP '2016-10-18 3:08:58+00', 'F25-29'
UNION ALL SELECT 'Lauren Reasoner', TIMESTAMP '2016-10-18 3:10:14+00', 'F30-34')
UNION ALL SELECT 'Lauren Reasoner', TIMESTAMP '2016-10-18 3:10:14+00', 'F30-34'
UNION ALL SELECT 'Nilly Nada', TIMESTAMP '2016-10-18 3:10:14+00', null)
SELECT name,
FORMAT_TIMESTAMP('%X', finish_time) AS finish_time,
division,
LEAD(name, 2, 'Nobody')
OVER (PARTITION BY division ORDER BY finish_time ASC) AS two_runners_back
FROM finishers`,
expectedRows: [][]interface{}{
{"Nilly Nada", "03:10:14", nil, "Nobody"},
{"Carly Forte", "03:08:58", "F25-29", "Nobody"},
{"Sophia Liu", "02:51:45", "F30-34", "Jen Edwards"},
{"Nikki Leith", "02:59:01", "F30-34", "Meghan Lederer"},
Expand Down
Loading