Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the PostgresStorageExtension to DuckDB #97

Merged
merged 36 commits into from
Sep 24, 2024
Merged

Conversation

Tishj
Copy link
Collaborator

@Tishj Tishj commented Aug 7, 2024

This PR aims to implement #56

It is still very barebones, a lot of the API's virtual methods are not implemented, but all existing tests are passing.

The existing functionality has been preserved as best as possible:

  • the PostgresIndexTable TableCatalogEntry will use the PostgresIndexScanFunction
  • the PostgresHeapTable TableCatalogEntry will use the PostgresSeqScanFunction
  • VIEWs are still handled by the ReplacementScan, replacing the view with its view_definition, which will get bound again and hit a PostgresIndexTable / PostgresHeapTable

I could find no tests for the PostgresIndexScanFunction, it's possible that functionality has been broken, but I've tried to keep as much of the existing code in tact.

@Tishj
Copy link
Collaborator Author

Tishj commented Aug 8, 2024

Now that the duckdb connection is a singleton this needs some more work, as it currently assumes the catalog to be thrown away after every query.
The created catalog entries need to be part of the Transaction now so they can be cleared on commit

@wuputah
Copy link
Collaborator

wuputah commented Aug 8, 2024

those changes aren't on main, but it is something to think about going forward.

@JelteF
Copy link
Collaborator

JelteF commented Aug 8, 2024

those changes aren't on main, but it is something to think about going forward.

For clarity, @Y-- and me are working together for a POC that can be used to show motherduck. But those PRs are not being merged in main yet (they are not nearly ready yet for actual usage)

@wuputah wuputah mentioned this pull request Aug 13, 2024
@JelteF
Copy link
Collaborator

JelteF commented Aug 19, 2024

Can you fix the merge conflicts with master? Then it should be much easier to try out. Also when compiling this PR locally I see a bunch of warnings.

include/pgduckdb/catalog/pgduckdb_table.hpp Show resolved Hide resolved
src/catalog/pgduckdb_catalog.cpp Outdated Show resolved Hide resolved
src/catalog/pgduckdb_transaction.cpp Outdated Show resolved Hide resolved
src/pgduckdb_planner.cpp Outdated Show resolved Hide resolved
src/scan/postgres_scan.cpp Outdated Show resolved Hide resolved
@Tishj
Copy link
Collaborator Author

Tishj commented Sep 23, 2024

@JelteF please review and let me know what the remaining blockers are, it's been open quite a while now.
I'd like to stop having to fix this with every PR that gets merged

@Tishj
Copy link
Collaborator Author

Tishj commented Sep 23, 2024

when I merge with main I can't resolve a crash in test/regression/sql/query_filter.sql
SELECT a, c FROM query_filter_output_column WHERE b = 't1'; is what triggers it, it's a EXC_BAD_ACCESS somewhere in the query profiler.

I suspect that the query data gets prematurely deleted, it's duckdb data it's trying and failing to access.
I don't know why this query specifically is causing the problem, I know there was recently work done on filter / projection pushdown?

src/pgduckdb_planner.cpp Outdated Show resolved Hide resolved
test/regression/Makefile Outdated Show resolved Hide resolved
@JelteF
Copy link
Collaborator

JelteF commented Sep 23, 2024

when I merge with main I can't resolve a crash in test/regression/sql/query_filter.sql
SELECT a, c FROM query_filter_output_column WHERE b = 't1'; is what triggers it, it's a EXC_BAD_ACCESS somewhere in the query profiler.

Did you fix this? Because I see you merged with main afterwards, and tests seem to pass now on CI.

Copy link
Collaborator

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think this is pretty much good to merge. Left some final comments. The major one being the (seemingly) incorrect merge. But all should be easy to address afaict.

@JelteF
Copy link
Collaborator

JelteF commented Sep 23, 2024

I'd like to stop having to fix this with every PR that gets merged

Totally makes sense. Let's merge this quickly now.

@Tishj
Copy link
Collaborator Author

Tishj commented Sep 24, 2024

when I merge with main I can't resolve a crash in test/regression/sql/query_filter.sql
SELECT a, c FROM query_filter_output_column WHERE b = 't1'; is what triggers it, it's a EXC_BAD_ACCESS somewhere in the query profiler.

Did you fix this? Because I see you merged with main afterwards, and tests seem to pass now on CI.

it's not, I can reproduce this on main and the PR that introduced it is #195

@Tishj Tishj mentioned this pull request Sep 24, 2024
@Mytherin Mytherin merged commit 897bd5c into main Sep 24, 2024
3 checks passed
@Mytherin Mytherin deleted the postgres_catalog_extension branch September 24, 2024 07:36
JelteF added a commit that referenced this pull request Sep 24, 2024
I think @Tishj misunderstood my review comment in #97 about the merge
being incorrect. I meant that the whole `pgduckdb_pg_get_querydef` stuff
could be removed from `DuckdbPlanNode`, since that is now moved to
`DuckdbPrepare`.
JelteF added a commit that referenced this pull request Sep 24, 2024
I think @Tishj misunderstood my review comment in #97 about the merge
being incorrect. I meant that the whole `pgduckdb_pg_get_querydef` stuff
could be removed from `DuckdbPlanNode`, since that is now moved to
`DuckdbPrepare`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants