-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support extended list types #282
Conversation
80f9f13
to
a3b9268
Compare
Thanks a lot! This looks great. We're currently preparing for a 0.1.0 release, and given the size of the PR we probably don't want to squeeze it in before that release (we're going to be mostly working on stabilizing what we have soon). But this looks like a good candidate to merge early for the release cycle after that one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some quick cursory review. I'll look at the actual code later.
Tests are failing on PG17. Seemingly because of a changed error message string. If you cannot change the test to create a consistent error message across postgres versions, then it's probably easiest to create a Python test in test/pycheck
directory. That way you can change the expected error message based on the postgres version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this seems like a good change. I quite like the refactoring that is done. The only real problem with this implementation is the current usage of SearchSysCache1
, which is not thread-safe.
template <int32_t OID> | ||
struct PostgresIntegerOIDMapping {}; | ||
static Datum | ||
ConvertCharDatum(const duckdb::Value &value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much like these new functions. I think taking that refactor on its own would make the code already a lot better.
numeric_var = ConvertNumeric<int64_t>(value.GetValueUnsafe<int64_t>(), scale); | ||
break; | ||
} | ||
case duckdb::PhysicalType::UINT64: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you didn't write this code, just moved it around. But reading this now it makes me wonder. Why is this the only unsigned integer type that we check for the physical type? @mkaruza do you know?
src/pgduckdb_types.cpp
Outdated
@@ -236,6 +396,16 @@ struct PostgresArrayAppendState { | |||
auto &values = duckdb::ListValue::GetChildren(value); | |||
idx_t to_append = values.size(); | |||
|
|||
if (to_append == 0 && dimension < number_of_dimensions-1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be new code. Could you explain in a comment what edge case this is handling? Because that is not super clear from me from just the code. Also I'm curious why this was not necessary before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Use another way to solve this issues.
The origin issue is when we insert empty array into multi-dimension array. The dimension[1]
to dimension[number_of_dimensions-1]
can not be set because of the logic of AppendValueAtDimension
function. So we use the dimension
to call the function ConstructArray
. It will cause overflow. The reason why it not necessary before is lack of the test cases. When you insert empty array into multi-dimension array. It will happen.
src/pgduckdb_types.cpp
Outdated
if (append_state.expected_values > 0) { | ||
pfree(datums); | ||
pfree(nulls); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this change should be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think the pfree
itself can handle this. But when I remove the if condition, and run the tests. I'll get the
CREATE TABLE char_array_2d(a CHAR(1)[][]);
INSERT INTO char_array_2d VALUES
('{{"a","b"},{"c","d"}}'),
('{{"e","f","g"},{"h","i","j"}}'),
(NULL),
('{{"k","l"},{"m",NULL}}'),
('{}');
SELECT * FROM char_array_2d;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
. That's why I add this. It is a little bit strange.
8a99eba
to
8445d88
Compare
remove the useless change for clean remove the useless change for clean remove the useless change for clean
8445d88
to
c20ea93
Compare
Are there any easy way to run the tests in different postgres version in local? I missed the error message in PG17. Can you help me to trigger the tests again? Thanks! |
@JelteF Thanks for your review! All the comments(except the test failed in PG17) are address. PTAL |
Regarding the test failure in PG17, which happens due to a different error message in that PG version. The approach we take so far for cases like this is to move this test to a Python test in the |
I known your meaning. But find the test cases can pass now. Just wonder will you run the regression tests in different postgres version when you develop some new features? |
Ah if it passes now you can disregard my comment. |
src/pgduckdb_types.cpp
Outdated
template <> | ||
Datum | ||
PostgresOIDMapping<BOOLOID>::ToDatum(const duckdb::Value &val) { | ||
return ConvertBoolDatum(val); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These overrides are unnecessary afaict, the default implementation for PostgresOIDMapping
will now do the correct thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JelteF Yes, you are right. Remove the useless part. PTAL.
src/pgduckdb_types.cpp
Outdated
static constexpr bool typbyval = true; | ||
static constexpr char typalign = 'c'; | ||
|
||
static Datum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's mark all these ToDatum
functions as static inline
. They are so tiny that there's no need for introducing calling overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do the same for some of the smaller Convert
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JelteF Updated. Thanks for your review. PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now. @Y-- do you have some thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great refactoring + added list support.
I can think of a few more cleanup/refactoring we could do in a separated PR like:
- split functions that have big
switch/case
to return in each case and handle the type mapping/dimensionality separately - use C++ vectors/allocator & raii pattern for allocations rather than
palloc
Overall a lot of good stuff, thanks for the PR!
@@ -301,7 +540,6 @@ ConvertDuckToPostgresArray(TupleTableSlot *slot, duckdb::Value &value, idx_t col | |||
D_ASSERT(value.type().id() == duckdb::LogicalTypeId::LIST); | |||
auto &child_type = GetChildTypeRecursive(value.type()); | |||
auto child_id = child_type.id(); | |||
D_ASSERT(child_id == OP::ExpectedType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you've removed the D_ASSERT
, you can now remove the unused variable child_id
?
src/pgduckdb_types.cpp
Outdated
@@ -192,6 +169,269 @@ ConvertNumeric(T value, idx_t scale) { | |||
return result; | |||
} | |||
|
|||
static Datum | |||
ConvertNumericDatum(const duckdb::Value &value) { | |||
if (value.type().id() == duckdb::LogicalTypeId::DOUBLE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we could assign value.type().id()
to a const here and re-use it a lot in this function
@Y-- Updated. Thanks for your review. PTAL. |
I think you are right. After this PR has been merged. I'd like to do this. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Related to #101
This pull request adds more array types. The changes include:
Implemented the PostgresOIDMapping structure to streamline type-specific operations.