Skip to content

Commit

Permalink
Review feedback changes
Browse files Browse the repository at this point in the history
* Guard function throw `duckdb::Exception(EXECUTOR,...)`
* Updated CONTRIBUTING.md according to review suggestion
* Guarded other suggested pg functions
  • Loading branch information
mkaruza committed Sep 19, 2024
1 parent e5768d5 commit 466d77e
Show file tree
Hide file tree
Showing 8 changed files with 42 additions and 46 deletions.
9 changes: 5 additions & 4 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -128,10 +128,11 @@ coding styles.
## Error Handling

* Use exceptions **only** when an error is encountered that terminates a query (e.g. parser error, table not found). Exceptions should only be used for **exceptional** situations. For regular errors that do not break the execution flow (e.g. errors you **expect** might occur) use a return value instead.
* *Exception* should be used in code that is executed inside duckdb engine.
* Use PostgreSQL *elog* API to report messages back to user. Using *ERROR* is strictly forbiden to use as it can brake execution and lead to
unexpected memory problems - *ERROR* is only used in duckdb node to early bail out execution in case of query preparation error or execution interruption.
* Calling PostgreSQL native functions needs to be used with *PostgresFunctionGuard*. *PostgresFunctionGuard* will handle correctly *ERROR* log messages that could be emmited from these functions.
* There are two distinct parts of the code where error handling is done very differently: The code that executes before we enter DuckDB execution engine (e.g. initial part of the planner hook) and the part that gets executed inside the duckdb execution engine. Below are rules for how to handle errors in both parts of the code. Not following these guidelines can cause crashes, memory leaks and other unexpected problems.
* Before we enter the DuckDB exection engine no exceptions should ever be thrown here. In cases where you would want to throw an exception here, use `elog(ERROR, ...)`. Any C++ code that might throw an exception is also problematic. Since C++ throws exceptions on allocation failures, this covers lots of C++ APIs. So try to use Postgres datastructures instead of C++ ones whenever possible (e.g. use `List` instead of `Vec`)
* Inside the duckdb execution engine the opposite is true. `elog(ERROR, ...)` should never be used there, use exceptions instead.
* Use PostgreSQL *elog* API can be used to report non-fatal messages back to user. Using *ERROR* is strictly forbiden to use in code that is executed inside the duckdb engine.
* Calling PostgreSQL native functions from within DuckDB execution needs **extreme care**. Pretty much non of these functions are thread-safe, and they might throw errors using `elog(ERROR, ...)`. If you've solved the thread-safety issue by taking a lock (or by carefully asserting that the actual code is thread safe), then you can use *PostgresFunctionGuard* to solve the `elog(ERROR, ...) problem. *PostgresFunctionGuard* will correctly handle *ERROR* log messages that could be emmited from these functions.
* Try to add test cases that trigger exceptions. If an exception cannot be easily triggered using a test case then it should probably be an assertion. This is not always true (e.g. out of memory errors are exceptions, but are very hard to trigger).
* Use `D_ASSERT` to assert. Use **assert** only when failing the assert means a programmer error. Assert should never be triggered by user input. Avoid code like `D_ASSERT(a > b + 3);` without comments or context.
* Assert liberally, but make it clear with comments next to the assert what went wrong when the assert is triggered.
Expand Down
27 changes: 21 additions & 6 deletions include/pgduckdb/pgduckdb_utils.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,11 @@ extern "C" {
#include "postgres.h"
}

#include "duckdb/common/exception.hpp"

#include <vector>
#include <string>
#include <sstream>
#include <cstdio>
#include <optional>

namespace pgduckdb {

Expand All @@ -23,44 +23,59 @@ TokenizeString(char *str, const char delimiter) {
return v;
};

/*
* DuckdbGlobalLock should be held before calling.
*/
template <typename T, typename FuncType, typename... FuncArgs>
std::optional<T>
T
PostgresFunctionGuard(FuncType postgres_function, FuncArgs... args) {
T return_value;
bool error = false;
MemoryContext ctx = CurrentMemoryContext;
ErrorData *edata = nullptr;
// clang-format off
PG_TRY();
{
return_value = postgres_function(args...);
}
PG_CATCH();
{
MemoryContextSwitchTo(ctx);
edata = CopyErrorData();
FlushErrorState();
error = true;
}
PG_END_TRY();
// clang-format on
if (error) {
return std::nullopt;
throw duckdb::Exception(duckdb::ExceptionType::EXECUTOR, edata->message);
}
return return_value;
}

template <typename FuncType, typename... FuncArgs>
std::optional<bool>
void
PostgresFunctionGuard(FuncType postgres_function, FuncArgs... args) {
bool error = false;
MemoryContext ctx = CurrentMemoryContext;
ErrorData *edata = nullptr;
// clang-format off
PG_TRY();
{
postgres_function(args...);
}
PG_CATCH();
{
MemoryContextSwitchTo(ctx);
edata = CopyErrorData();
FlushErrorState();
error = true;
}
PG_END_TRY();
// clang-format on
return error;
if (error) {
throw duckdb::Exception(duckdb::ExceptionType::EXECUTOR, edata->message);
}
}

} // namespace pgduckdb
1 change: 0 additions & 1 deletion src/pgduckdb.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ extern "C" {

#include "pgduckdb/pgduckdb.h"
#include "pgduckdb/pgduckdb_node.hpp"
#include "pgduckdb/pgduckdb_utils.hpp"

static void DuckdbInitGUC(void);

Expand Down
22 changes: 7 additions & 15 deletions src/pgduckdb_detoast.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -115,26 +115,18 @@ ToastFetchDatum(struct varlena *attr) {
return result;
}

DuckdbProcessLock::GetLock().lock();
toast_rel = try_table_open(toast_pointer.va_toastrelid, AccessShareLock);
std::lock_guard<std::mutex> lock(DuckdbProcessLock::GetLock());

if (toast_rel != NULL) {
DuckdbProcessLock::GetLock().unlock();
throw duckdb::InternalException("(PGDuckDB/ToastFetchDatum) Error opening toast relation");
}
toast_rel = PostgresFunctionGuard<Relation>(try_table_open, toast_pointer.va_toastrelid, AccessShareLock);

bool error_fetch_toast = false;
if (PostgresFunctionGuard(table_relation_fetch_toast_slice, toast_rel, toast_pointer.va_valueid, attrsize, 0,
attrsize, result).value()) {
error_fetch_toast = true;
if (toast_rel == NULL) {
throw duckdb::InternalException("(PGDuckDB/ToastFetchDatum) Error toast relation is NULL");
}

table_close(toast_rel, AccessShareLock);
DuckdbProcessLock::GetLock().unlock();
PostgresFunctionGuard(table_relation_fetch_toast_slice, toast_rel, toast_pointer.va_valueid, attrsize, 0, attrsize,
result);

if (error_fetch_toast) {
throw duckdb::InternalException("(PGDuckDB/ToastFetchDatum) Error reading external toast table");
}
PostgresFunctionGuard(table_close, toast_rel, AccessShareLock);

return result;
}
Expand Down
4 changes: 2 additions & 2 deletions src/pgduckdb_node.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -81,12 +81,12 @@ ExecuteQuery(DuckdbScanState *state) {
CleanupDuckdbScanState(state);
// Process the interrupt on the Postgres side
ProcessInterrupts();
elog(ERROR, "(PGDuckDB/ExecuteQuery) Query cancelled");
elog(ERROR, "Query cancelled");
}
} while (!duckdb::PendingQueryResult::IsResultReady(execution_result));
if (execution_result == duckdb::PendingExecutionResult::EXECUTION_ERROR) {
CleanupDuckdbScanState(state);
elog(ERROR, "(PGDuckDB/ExecuteQuery) Query execution returned an error: %s", pending->GetError().c_str());
elog(ERROR, "(PGDuckDB/ExecuteQuery) %s", pending->GetError().c_str());
}
query_results = pending->Execute();
state->column_count = query_results->ColumnCount();
Expand Down
19 changes: 4 additions & 15 deletions src/scan/heap_reader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -87,25 +87,14 @@ HeapReader::ReadPageTuples(duckdb::DataChunk &output) {
while (block != InvalidBlockNumber) {
if (m_read_next_page) {
CHECK_FOR_INTERRUPTS();
DuckdbProcessLock::GetLock().lock();
std::lock_guard<std::mutex> lock(DuckdbProcessLock::GetLock());
block = m_block_number;

auto opt_buffer = PostgresFunctionGuard<Buffer>(ReadBufferExtended, m_relation, MAIN_FORKNUM, block,
RBM_NORMAL, GetAccessStrategy(BAS_BULKREAD));

if (!opt_buffer.has_value()) {
DuckdbProcessLock::GetLock().unlock();
throw duckdb::InternalException("(PGDuckdDB/ReadPageTuples) ReadBufferExtended failed");
}

m_buffer = opt_buffer.value();
m_buffer = PostgresFunctionGuard<Buffer>(ReadBufferExtended, m_relation, MAIN_FORKNUM, block, RBM_NORMAL,
GetAccessStrategy(BAS_BULKREAD));

if (PostgresFunctionGuard(LockBuffer, m_buffer, BUFFER_LOCK_SHARE).value()) {
DuckdbProcessLock::GetLock().unlock();
throw duckdb::InternalException("(PGDuckdDB/ReadPageTuples) LockBuffer failed");
}
PostgresFunctionGuard(LockBuffer, m_buffer, BUFFER_LOCK_SHARE);

DuckdbProcessLock::GetLock().unlock();
page = PreparePageRead();
m_read_next_page = false;
}
Expand Down
4 changes: 2 additions & 2 deletions test/regression/expected/array_type_support.out
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ INSERT INTO int_array_2d VALUES
('{{11, 12, 13}, {14, 15, 16}}'),
('{{17, 18}, {19, 20}}');
SELECT * FROM int_array_2d;
ERROR: (PGDuckDB/ExecuteQuery) Query execution returned an error: Invalid Input Error: Dimensionality of the schema and the data does not match, data contains more dimensions than the amount of dimensions specified by the schema
ERROR: (PGDuckDB/ExecuteQuery) Invalid Input Error: Dimensionality of the schema and the data does not match, data contains more dimensions than the amount of dimensions specified by the schema
drop table int_array_2d;
-- INT4 (single dimensional data, two dimensionsal type)
CREATE TABLE int_array_2d(a INT[][]);
Expand All @@ -44,7 +44,7 @@ INSERT INTO int_array_2d VALUES
('{11, 12, 13}'),
('{17, 18}');
SELECT * FROM int_array_2d;
ERROR: (PGDuckDB/ExecuteQuery) Query execution returned an error: Invalid Input Error: Dimensionality of the schema and the data does not match, data contains fewer dimensions than the amount of dimensions specified by the schema
ERROR: (PGDuckDB/ExecuteQuery) Invalid Input Error: Dimensionality of the schema and the data does not match, data contains fewer dimensions than the amount of dimensions specified by the schema
drop table int_array_2d;
-- INT4 (two dimensional data and type)
CREATE TABLE int_array_2d(a INT[][]);
Expand Down
2 changes: 1 addition & 1 deletion test/regression/expected/execution_error.out
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,6 @@ insert into int_as_varchar SELECT * from (
('abc')
) t(a);
select a::INTEGER from int_as_varchar;
ERROR: (PGDuckDB/ExecuteQuery) Query execution returned an error: Conversion Error: Could not convert string 'abc' to INT32
ERROR: (PGDuckDB/ExecuteQuery) Conversion Error: Could not convert string 'abc' to INT32
LINE 1: SELECT (a)::integer AS a FROM int_as_varchar
^

0 comments on commit 466d77e

Please sign in to comment.