Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bindings: UnknownSchema added data property #1799

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

natchar
Copy link

@natchar natchar commented Sep 27, 2024

Fixes #1234

Python bindings return dictionary-type python object for UnknownSchema data.

Copy link

linux-foundation-easycla bot commented Sep 27, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: natchar / name: Natchar (a8974b3)

Signed-off-by: Natchar Ratanasirigulchai <[email protected]>
@natchar natchar force-pushed the otio_unknownSchema_data branch from cabe1df to a8974b3 Compare September 27, 2024 23:07
@codecov-commenter
Copy link

codecov-commenter commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 88.88889% with 1 line in your changes missing coverage. Please review.

Project coverage is 81.61%. Comparing base (c0e97b0) to head (a8974b3).
Report is 22 commits behind head on main.

Files with missing lines Patch % Lines
...entimelineio-bindings/otio_serializableObjects.cpp 66.66% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1799      +/-   ##
==========================================
- Coverage   84.11%   81.61%   -2.51%     
==========================================
  Files         198      176      -22     
  Lines       22241    12344    -9897     
  Branches     4687     3024    -1663     
==========================================
- Hits        18709    10075    -8634     
+ Misses       2610     1732     -878     
+ Partials      922      537     -385     
Flag Coverage Δ
py-unittests 81.61% <88.88%> (-2.51%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/opentimelineio/unknownSchema.h 33.33% <100.00%> (+33.33%) ⬆️
tests/test_unknown_schema.py 90.47% <100.00%> (+2.24%) ⬆️
...entimelineio-bindings/otio_serializableObjects.cpp 93.51% <66.66%> (-0.35%) ⬇️

... and 66 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5184c36...a8974b3. Read the comment docs.

@jminor
Copy link
Collaborator

jminor commented Sep 30, 2024

This looks good to me. Once this lands, then this feature in Raven should be easy to implement: OpenTimelineIO/raven#26

@ssteinbach do you have any concerns with this?

@jminor jminor requested a review from ssteinbach September 30, 2024 15:58
@jmertic jmertic added the devdays24 Dev Days 2024 PRs/Issues label Sep 30, 2024
@@ -31,6 +31,11 @@ class UnknownSchema : public SerializableObject
return _original_schema_version;
}

AnyDictionary& data() noexcept
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the intention to provide mutable access? Reading the original issue it talks about accessing dictionaries presumably for reading. A reference opens the door to lifetime issues where the reference to the AnyDictionary can outlive the UnknownSchema object.

I think it would be better to return a copy of the AnyDictionary, even though that incurs overhead, and if we want mutability provide a method to supply a new dictionary.

Alternatively, we could change _data to be a shared pointer, but treating the AnyDictionary as a value type would give safer concurrency, if we also added a mutex during the copy and replace.

We don't have much consideration for concurrency in otio at the moment, but it would make sense to me to think about it moving forward.

Does anyone (@jminor ?) have a sense of whether these dictionaries might tend to be massive (dozens or hundreds of entries), or are they more on the order of a few entries?

If the former, I think I'd bias to a shared_ptr solution and ignore concurrency inside the object; if the latter, return/set by value pattern with a mutex in anticipation of concurrency.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@meshula Would a const & do the trick? Or you want a copy? And if its a copy should we call it data_copy() or something to disambiguate that you're not getting direct access to the child dictionary?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding size: a typical OTIO object should be small, with possibly dozens of JSON key/values in metadata. However, given that this class, UnknownSchema, is meant for user defined schema types, they really could be anything. This is the place where we're encouraging developers to add interesting new stuff, so we should be open-ended about what might be in there.

That leads me to support your suggestion that the API be defensive about UnknownSchema by providing read-only access. That will address the desired workflow (making UnknownSchema introspectable) without opening the door to meddling with the contents.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool :) @ssteinbach const& doesn't protect against object lifetime issues, or concurrent modification, which I'm worried about. It would work if we had a notion of a holder on the container that also blocked writing while the container is held, but we haven't got that :]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a way to modify the contents of an UnknownSchema object via our API, except via read_from which fully replaces the contents, so I suspect concurrency is not an issue. If a developer wants to modify these things, then they can register a schema and get a real object instead of UnknownSchema.

Would lifetime issues be handled by returning a copy as suggested earlier in this thread?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a copy solves lifetime issues :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cool. @natchar could you alter data() to return a copy of _data?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is returning a copy of data, perhaps instead of it being a property, it should be a function definition so that the call is unknownSchema.data().

Maybe I'm missing something, but if I do change the function signature to AnyDictionary data() const noexcept that should return a copy of _data. However, in the python bindings we use AnyDictionaryProxy for AnyDictionary and running the test will yield Underlying C++ AnyDictionary has been destroyed. Is there another change I need to make to AnyDictionaryProxy?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's reporting that the copy has been destroyed, right? So I guess we need to look at the binding, and make sure that's it's doing what we expect, and then remove the warning... Any other thoughts?

Copy link
Collaborator

@jminor jminor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing discussion thread: could you alter data() to return a copy of _data?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devdays24 Dev Days 2024 PRs/Issues python-bindings
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UnknownSchema objects should allow dictionary-like access to data fields
6 participants