Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Fix ManagedQuery segv: ctor/dtor order fix #3544

Merged
merged 6 commits into from
Jan 10, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions apis/python/tests/test_sparse_nd_array.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

import contextlib
import datetime
import gc
import itertools
import json
import operator
Expand Down Expand Up @@ -1979,3 +1980,30 @@ def test_iter(tmp_path: pathlib.Path):
next(a)
with pytest.raises(StopIteration):
next(b)


def test_context_cleanup(tmp_path: pathlib.Path) -> None:
arrow_tensor = create_random_tensor("table", (1,), np.float32(), density=1)
with soma.SparseNDArray.create(
tmp_path.as_uri(), type=pa.float64(), shape=(1,)
) as write_arr:
write_arr.write(arrow_tensor)

def test(path, tiledb_config):
context = soma.SOMATileDBContext().replace(tiledb_config=tiledb_config)
X = soma.SparseNDArray.open(path, context=context, mode="r")
mq = X.read().tables()
return mq

for _ in range(100):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this loop here because the error is intermittent?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not intermittent in my setup. But Python doesn't give any real guarantees about GC behavior, so it is possible that multiple calls might be required to trip over it in some Python implementation. I was just being cautious - trying to be triple-sure we would eventually trigger GC of the reference holders

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please leave a code comment to that effect within the test case -- otherwise future readers may be puzzled ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

# Run test multiple times. While the C++ this tests (dtor order)
# is deterministic, it is triggered by the Python GC, which makes
# no specific guarantees about when it will sweep any given object.
_ = test(
tmp_path.as_uri(),
{
"vfs.s3.no_sign_request": "true",
"vfs.s3.region": "us-west-2",
},
)
gc.collect()
8 changes: 4 additions & 4 deletions libtiledbsoma/src/soma/managed_query.cc
Original file line number Diff line number Diff line change
Expand Up @@ -51,8 +51,8 @@ ManagedQuery::ManagedQuery(
std::shared_ptr<Array> array,
std::shared_ptr<Context> ctx,
std::string_view name)
: array_(array)
, ctx_(ctx)
: ctx_(ctx)
, array_(array)
, name_(name)
, schema_(std::make_shared<ArraySchema>(array->schema())) {
reset();
Expand All @@ -62,8 +62,8 @@ ManagedQuery::ManagedQuery(
std::unique_ptr<SOMAArray> array,
std::shared_ptr<Context> ctx,
std::string_view name)
: array_(array->arr_)
, ctx_(ctx)
: ctx_(ctx)
, array_(array->arr_)
, name_(name)
, schema_(std::make_shared<ArraySchema>(array->arr_->schema())) {
reset();
Expand Down
12 changes: 8 additions & 4 deletions libtiledbsoma/src/soma/managed_query.h
Original file line number Diff line number Diff line change
Expand Up @@ -98,8 +98,8 @@ class ManagedQuery {
ManagedQuery(const ManagedQuery&) = delete;

ManagedQuery(ManagedQuery&& other)
: array_(other.array_)
, ctx_(other.ctx_)
: ctx_(other.ctx_)
, array_(other.array_)
, name_(other.name_)
, schema_(other.schema_)
, query_(std::make_unique<Query>(*other.ctx_, *other.array_))
Expand Down Expand Up @@ -499,12 +499,16 @@ class ManagedQuery {
const CurrentDomain& current_domain, bool is_read);
void _fill_in_subarrays_if_dense_without_new_shape(bool is_read);

// TileDB array being queried.
std::shared_ptr<Array> array_;
// NB: the Array dtor REQUIRES that this context be alive, so member
johnkerl marked this conversation as resolved.
Show resolved Hide resolved
// declaration order is significant. Context (ctx_) MUST be declared
// BEFORE Array (array_) so that ctx_ will be destructed last.

// TileDB context object
std::shared_ptr<Context> ctx_;

// TileDB array being queried.
std::shared_ptr<Array> array_;

// Name displayed in log messages
std::string name_;

Expand Down
4 changes: 4 additions & 0 deletions libtiledbsoma/src/soma/soma_array.h
Original file line number Diff line number Diff line change
Expand Up @@ -1544,6 +1544,10 @@ class SOMAArray : public SOMAObject {
// SOMAArray name for debugging
std::string name_;

// NB: the Array dtor REQUIRES that this context be alive, so member
// declaration order is significant. Context (ctx_) MUST be declared
// BEFORE Array (arr_) so that ctx_ will be destructed last.

// SOMA context
std::shared_ptr<SOMAContext> ctx_;

Expand Down