-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explicitly convert datum values before filter operation #194
Conversation
95fb4d9
to
89fc42d
Compare
test/regression/sql/query_filter.sql
Outdated
SELECT COUNT(*) = 1 FROM query_filter_float WHERE a < 1.0; | ||
SELECT COUNT(*) = 2 FROM query_filter_float WHERE a <= 1.0; | ||
SELECT COUNT(*) = 2 FROM query_filter_float WHERE a < 1.1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience you should generally avoid returning booleans from queries in pgregress tests, because it makes it harder to debug the test when it starts failing. With your current code you only know that the test failed, but not why. With the following, you could see how many rows were found, which can be helpful with debugging.
SELECT COUNT(*) = 1 FROM query_filter_float WHERE a < 1.0; | |
SELECT COUNT(*) = 2 FROM query_filter_float WHERE a <= 1.0; | |
SELECT COUNT(*) = 2 FROM query_filter_float WHERE a < 1.1; | |
SELECT COUNT(*) FROM query_filter_float WHERE a < 1.0; | |
SELECT COUNT(*) FROM query_filter_float WHERE a <= 1.0; | |
SELECT COUNT(*) FROM query_filter_float WHERE a < 1.1; |
@@ -55,9 +55,9 @@ FilterOperationSwitch(Datum &value, duckdb::Value &constant, Oid type_oid) { | |||
case INT8OID: | |||
return TemplatedFilterOperation<int64_t, OP>(value, constant); | |||
case FLOAT4OID: | |||
return TemplatedFilterOperation<float, OP>(value, constant); | |||
return TemplatedFilterOperation<float, OP>(DatumGetFloat4(value), constant); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we should probably do this explicit conversion for all types here, not just for floats.
89fc42d
to
b1ebe29
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two minor suggestions, other than that this is good IMO.
src/pgduckdb_filter.cpp
Outdated
case DATEOID: { | ||
Datum date_datum = static_cast<int32_t>(value + pgduckdb::PGDUCKDB_DUCK_DATE_OFFSET); | ||
return TemplatedFilterOperation<int32_t, OP>(date_datum, constant); | ||
int32_t date = static_cast<int32_t>(value + pgduckdb::PGDUCKDB_DUCK_DATE_OFFSET); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int32_t date = static_cast<int32_t>(value + pgduckdb::PGDUCKDB_DUCK_DATE_OFFSET); | |
int32_t date = static_cast<int32_t>(DatumGetDate(value) + pgduckdb::PGDUCKDB_DUCK_DATE_OFFSET); |
src/pgduckdb_filter.cpp
Outdated
} | ||
case TIMESTAMPOID: { | ||
Datum timestamp_datum = static_cast<int64_t>(value + pgduckdb::PGDUCKDB_DUCK_TIMESTAMP_OFFSET); | ||
return TemplatedFilterOperation<int64_t, OP>(timestamp_datum, constant); | ||
int64_t timestamp = static_cast<int64_t>(value + pgduckdb::PGDUCKDB_DUCK_TIMESTAMP_OFFSET); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int64_t timestamp = static_cast<int64_t>(value + pgduckdb::PGDUCKDB_DUCK_TIMESTAMP_OFFSET); | |
int64_t timestamp = static_cast<int64_t>(DatumGetTimestamp(value) + pgduckdb::PGDUCKDB_DUCK_TIMESTAMP_OFFSET); |
I updated the PR description with some additional info/explanation of the problem. |
Datums can contain a pass-by-value or a pass-by-reference type. We were simply casting the Datum for a pass by value type, but that is not always correct. Specifically for floating point numbers this would not always result in the correct floating point number. And int64 values are pass-by-reference on 32bit systems, but pass-by-value on 64bit systems. Postgres provides the DatumGetXyz macros/functions to work around these problems. This starts using these functions in FilterOperationSwitch to avoid these casting problems there.
b1ebe29
to
49f9c50
Compare
Update commit message. |
Afaict the repo is configured to use the PR description as the commit message when merging by default, when doing a squash-merge. That's why I usually like to the PR description to explain a bit what the PR is doing before merging. (it's even better if the PR description is helpful when opening the PR ofcourse, to make it easy for reviewing, but in this case I was able to understand what the issue was from context I had). |
Datums can contain a pass-by-value or a pass-by-reference type. We were simply casting the
Datum
for a pass by value type, but that is not always correct. Specifically for floating point numbers this would not always result in the correct floating point number. And int64 values are pass-by-reference on 32bit systems, but pass-by-value on 64bit systems. Postgres provides theDatumGetXyz
macros/functions to work around these problems. This starts using these functions inFilterOperationSwitch
to avoid these casting problems there.Fixes #189