Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Harden against Postgres interactions with C++ destructors #254

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Y--
Copy link
Collaborator

@Y-- Y-- commented Oct 7, 2024

Fixes #93

  • make sure that no function uses elog(ERROR... which would bypass C++ dtors, and instead throw regular exceptions
  • wraps every PG hooks in a try/catch block that converts the C++ exception to an PG equivalent (elog(ERROR, ... form)

@Y-- Y-- requested a review from JelteF October 7, 2024 13:42
Copy link
Collaborator

@hlinnaka hlinnaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At quick glance, there are calls to get_namespace_oid() and get_relname_relid() in DuckDBManager::CheckSecretsSeq() that also seem unsafe.

Is there a general plan for how to sanity check that each function is handling errors correctly. If I understand correctly, some code runs in "Postgres regime", where you use elog(ERROR), memory context etc, while other code runs in "duckdb regime" where you use c++ exceptions etc. And whenever you cross the boundary, you must use PostgresFunctionGuard or DuckDBFunctionGuard.

Can we have some kind of compiler support or conventions or something to help with getting that right?

Comment on lines 191 to 127
appendStringInfo(duckdb_extension, "LOAD %s;", extension.name.c_str());
auto res = context.Query(duckdb_extension->data, false);
if (res->HasError()) {
elog(ERROR, "(PGDuckDB/LoadExtensions) `%s` could not be loaded with DuckDB", extension.name.c_str());
}
DuckDBQueryOrThrow(context, "LOAD " + extension.name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think DuckDBManager::DropSecrets needs a similar change

@Y-- Y-- marked this pull request as draft October 10, 2024 12:30
@Y--
Copy link
Collaborator Author

Y-- commented Oct 10, 2024

At quick glance, there are calls to get_namespace_oid() and get_relname_relid() in DuckDBManager::CheckSecretsSeq() that also seem unsafe.

Is there a general plan for how to sanity check that each function is handling errors correctly. If I understand correctly, some code runs in "Postgres regime", where you use elog(ERROR), memory context etc, while other code runs in "duckdb regime" where you use c++ exceptions etc. And whenever you cross the boundary, you must use PostgresFunctionGuard or DuckDBFunctionGuard.

Can we have some kind of compiler support or conventions or something to help with getting that right?

Thanks for the review @hlinnaka ! There are indeed a lot more places that requires attention. (I didn't realize the PR wasn't in draft - updated now)

My goal is to get to a place where we:

  • use C++ as much as possible in this codebase
  • wrap non-trivial PG function calls with PostgresFunctionGuard to catch PG errors and throw C++ exceptions
  • wrap every top-level function (hooks, etc.) to catch C++ exceptions and convert back to PG errors

What do you think about this approach?

@hlinnaka
Copy link
Collaborator

My goal is to get to a place where we:

  • use C++ as much as possible in this codebase

  • wrap non-trivial PG function calls with PostgresFunctionGuard to catch PG errors and throw C++ exceptions

  • wrap every top-level function (hooks, etc.) to catch C++ exceptions and convert back to PG errors

What do you think about this approach?

Yep, that works. Greenplum's orca optimizer does something similar, see e.g. https://github.com/cloudberrydb/cloudberrydb/blob/main/src/backend/gpopt/gpdbwrappers.cpp. I think PostGIS too.

@Y-- Y-- force-pushed the yl/handle-cpp-exceptions branch 2 times, most recently from f1286bf to 9cad7d7 Compare October 11, 2024 18:28
@JelteF JelteF added this to the 0.2.0 milestone Oct 15, 2024
@JelteF
Copy link
Collaborator

JelteF commented Oct 15, 2024

Moved this to the 0.2.0 milestone, to allow us to focus on stability testing for 0.1.0. We might still merge part of this PR for 0.1.0 where necessary to resolve stability issues we find during testing, but for now merging this wholesale so close to the release seems more risky than postponing it. We'll try to merge this (and related fixes) early in the 0.2.0 release cycle so it can bake longer.

@Y-- Y-- force-pushed the yl/handle-cpp-exceptions branch 4 times, most recently from d73eaca to 19c6e55 Compare October 30, 2024 09:24
src/pgduckdb_ddl.cpp Outdated Show resolved Hide resolved
@@ -561,8 +558,9 @@ duckdb_alter_table_trigger(PG_FUNCTION_ARGS) {
SetUserIdAndSecContext(saved_userid, sec_context);
AtEOXact_GUC(false, save_nestlevel);

if (ret != SPI_OK_SELECT)
elog(ERROR, "SPI_exec failed: error code %s", SPI_result_code_string(ret));
if (ret != SPI_OK_SELECT) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be converted to SPI_exec_or_throw afaict.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm don't we want to reset the user id etc (L558-559) before throwing? I thought about adding a version with a callback to execute before throwing but that felt overkill.

include/pgduckdb/pgduckdb_utils.hpp Outdated Show resolved Hide resolved
Datum is_temporary_datum = SPI_getbinval(tuple, SPI_tuptable->tupdesc, 2, &isnull);
if (isnull) {
elog(ERROR, "Expected temporary boolean to be returned, but found NULL");
throw std::runtime_error("Expected temporary boolean to be returned, but found NULL");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed this in person, but I'll write it down here as well:

I'm not entirely convinced yet that changing all the elogs to exceptions in this file and in pgduckdb_ruleutils.cpp is all that helpful. It doesn't hurt either functionally, because you've wrapped this fuction in an InvokeCPPFunc call, and there's no PG_TRY around them. So there is no problem with throwing exceptions now.

But I think these specific functions probably won't benefit from being rewritten to C++ completely, and would be easier to maintain in Postgres C style. Mainly because they call a lot of Postgres functions and very few DuckDB functions. Wrapping all of those Postgres calls in PostgresFunctionGuard calls doesn't necessarily sound like it makes the code easier to read and maintain. I'm happily proven wrong though.

If you're fairly sure this is the right direction for these functions I'm fine with merging this now. But I think it should be a separate commit from most of the other changes (specifically the new and improved macros). To make reverting it easier if it turns out we want to make these functions fully Postgres C instead of fully C++.

src/pgduckdb_table_am.cpp Outdated Show resolved Hide resolved
@Y-- Y-- force-pushed the yl/handle-cpp-exceptions branch 2 times, most recently from 4049046 to bb35b1a Compare October 30, 2024 13:53
This was referenced Oct 30, 2024
@Y-- Y-- force-pushed the yl/handle-cpp-exceptions branch 2 times, most recently from c652a04 to feebb13 Compare October 30, 2024 16:45
@Y-- Y-- force-pushed the yl/handle-cpp-exceptions branch 2 times, most recently from c7b9261 to 5731604 Compare October 31, 2024 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Harden against Postgres interactions with C++ destructors
3 participants