Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lifecycle improvements #47

Open
wants to merge 46 commits into
base: develop
Choose a base branch
from
Open

Conversation

Deric-W
Copy link
Contributor

@Deric-W Deric-W commented Apr 12, 2023

Hello,
some time ago I proposed some changes in #38 which I have implemented.

Changes

The following changes have been made besides the ones mentioned in the issue.

Disposing node primitives

While Node still has methods for removing primitives like publishers they are marked as internal and are called by the Dispose methods of the primitives.
Furthermore, all dispose implementations are not thread safe which has been documented and can be changed if required.

Ros2cs class

The Ros2cs class has been split into Context which handles node creation and ManualExecutor which handles spinning.
Executors are a concept fund in other rcl client libraries and allow for multithreading and other spin implementations.
To prevent threading issues a node can have only one executor which handles all of its primitives.

Interfaces

Most of the rcl wrappers are only exposed behind an interface to allow wrappers (like the one used for Unity) to implement those interfaces which would help RobotecAI/ros2-for-unity#51.

Removal of C structs

To prevent issues with definitions becoming out of date and the GC moving memory at unexpected times most rcl structs
are hidden behind an IntPtr and only accessed via C wrappers.

Removal of disposal status tracking

Whether a wrapper has been disposed is being detected by calling the *_is_valid rcl functions or checking whether the pointer has been set to NULL.
This should prevent the status tracking from becoming out of sync with the rcl.

Guard Conditions

Guard Conditions have been implemented to allow ManualExecutor to be woken when a node requests removal.
They are internal for now since they seem to be useless without access to wait sets whcih are internal too to prevent users from putting primitives into one which combined with an active executor can lead to thread safety problems.

This PR is pretty big and can break some existing code, if it is desireable I can try to break it into smaller PRs.
The changes have been tested on Windows 10 and Ubuntu (using the changes from master) with Ros2 Humble.
I was unable to get ros2 run to find the example package, which is why the examples have to be tested by you until I discover my error.

@Deric-W
Copy link
Contributor Author

Deric-W commented Apr 14, 2023

I have managed to test the examples and they seem to be working.

@pijaro
Copy link
Collaborator

pijaro commented Apr 17, 2023

Looking great, let us review it. Thanks for the PR 😃

/// <remarks>
/// This method is thread safe.
/// </remarks>
public bool TryCreateNode(string name, out INode node)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if we should use context to create nodes. If I'm not wrong, this method checks if the node with the current name already lives in this context. This is a weaker version of the check performed in Node.cs
by rcl_node_init - isn't it unnecessarily redundant then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context is responsible for creating nodes since it tracks all nodes associated with it to dispose them when it is disposed.
The docs say that a node with the same name would be shut down but also that node name uniqueness is not yet enforced (???).
To be sure I implemented the check to prevent removing valid nodes from the dictionary and let the user deal with name collisions since disposing a node is not thread safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran some tests with ROS2 Humble and the rcl did not complain while the old node remained valid when directly calling the Node constructor multiple times with the same name.

/// <param name="name"> Name of the node. </param>
/// <param name="context"> Context to associate with. </param>
/// <exception cref="ObjectDisposedException"> If <paramref name="context"/> is disposed. </exception>
internal Node(string name, Context context)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly connected with the previous comment - is there a reason we want to hide the Node constructor?

Copy link
Contributor Author

@Deric-W Deric-W Apr 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing the Node constructor would allow users to create nodes without the context knowing which in turn would allow disposing the context before disposing the nodes.
Judging by this I think its not allowed.

/// <see cref="IEnumerator{bool}"/> trying to spin in each iteration
/// and yielding if a rescan had to be performed instead.
/// </returns>
public IEnumerator<bool> Spin(TimeSpan timeout)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge fan of the IEnumerator approach here 🤔

It would be more user-friendly to use a spin more naturally, like executor.Spin() rather than wrapping the IEnumerator with for. A default timeout could also be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with Spin(TimeSpan) is that it would be uninterruptible since disposing can only be done when the executor is not in use.
How about SpinWhile(Func<bool> condition, TimeSpan timeout)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default arguments for TimeSpan are tricky which is why I created a overload which creates a default timeout.

@pijaro
Copy link
Collaborator

pijaro commented Apr 25, 2023

@Deric-W I started the review, but as you noticed, it is quite a lot of code 😃. Great job 👍

This PR is pretty big and can break some existing code, if it is desireable I can try to break it into smaller PRs.

I don't know if this is possible because many things are tightly connected, but if you can - smaller PRs could help a lot here.

@Deric-W
Copy link
Contributor Author

Deric-W commented Apr 28, 2023

I can try to create a separate PR for allocating the C structs in native memory, but I dont know how it will handle the possibility of primitives being disposed while being waited on (which seems possible on master).
This PR handles this case by notifying the executor (which manages the wait set) that a change occured and waiting until the change was handled (by regenerating the wait set, but the wait already ends after it is safe to assume that the wait set will be regenerated before the next use).
I dont know how master handles this problem.

@adamdbrw
Copy link
Member

adamdbrw commented May 8, 2023

@Deric-W this is a great contribution and I am looking forward to reviewing it. My apologies for the lengthy process, It has been quite a busy period for me.

Copy link
Member

@adamdbrw adamdbrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Deric-W These are great and impactful changes!
What's great:

  • Covers more ground in implementing rcl features to get ros2cs a bit closer to rclcpp in coverage. This includes things like the executor and waitset.
  • Improves interfaces and overall structure, which is excellent.
  • Removes some of the "mirror structs" for rcl which were prone to API changes when porting to new versions.
  • Exposes Context, which enables users to have more flexibility and builds ground for potentially useful configuration options such as setting different domain ids within one simulation.
  • Improves lifecycle and synchronization, especially for services.
  • Brings tests as well as useful utilities.

What's also impactful:

  • The way of using ros2cs is changed. Essentially this necessitates change in ROS 2 For Unity project which depends on ros2cs. I am not yet certain how much work it would be to adjust everythin. I am also not certain as to the degree of changes needed in R2FU applications (we have several of these ongoing) on top of that. Maybe it's not too much work, feel welcome to express an opinion on this.
  • This needs to be released with a major version uptick.
  • We need to test a lot!

We have an issue with limited resources to update the dependencies. But, these are very good changes and I would like them in. I would also like you to get proper attribution for your work, since this is quite a submission. That could include some of header attribution (as I commented) but also proper recognition in our next release.

It also makes me curious about what you are building with the library. If you would like to chat, let me know and reach out [email protected].

src/ros2cs/ros2cs_core/interfaces/IClient.cs Outdated Show resolved Hide resolved
@@ -0,0 +1,51 @@
// Copyright 2019-2021 Robotec.ai
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are welcome to add your Copyright underneath if you wish (perhaps for each new file with new code and substantial modification to existing files?).

/// <summary>
/// Event triggered after context shutdown before disposing nodes and finalization.
/// </summary>
event Action OnShutdown;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

@@ -0,0 +1,60 @@
// Copyright 2019-2021 Robotec.ai
Copy link
Member

@adamdbrw adamdbrw May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Executor and WaitSet in particular are new (useful) features for ros2cs. It would be perfect to separate it out to another PR. Would it be something hard to accomplish?

Update: this is not necessary for acceptance of this PR, I see this could mean quite some extra work for you.

// everything is disposed when disposing the context
using Context context = new Context();
using ManualExecutor executor = new ManualExecutor(context);
context.TryCreateNode("listener", out INode node);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows changes in how the library is used. R2FU has to be adjusted (and perhaps, applications as well).
Taking in this PR warrants a major version upgrade for the next release.

Copy link
Contributor Author

@Deric-W Deric-W May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can work on updating R2FU, but I think I have to do some changes since some features require destructors touching other objects.
ROS2ForUnity, ROS2UnityComponent and ROS2UnityCore can probably be converted to Context wrappers with some small changes while ROS2Node will loose his main feature of being collectable by the GC.
To compensate there coud be a variant which is a MonoBehavior and uses the callbacks provided by Unity like OnDestroy().

@Deric-W
Copy link
Contributor Author

Deric-W commented Jul 7, 2023

While working on updating R2FU I noticed that currently GuardCondition and WaitSet use IContext.OnShutdown to automatically dispose themself while it is documented that this happens after all callbacks have been invoked, making it difficult to do things like stopping a background thread using this event.
A possible fix is to handle them like nodes, which should change nothing besides the time at which they are disposed by the context.

Deric-W and others added 19 commits August 16, 2023 14:33
Executors should allow for more
flexibility and encapsulation.

 create mode 100644 src/ros2cs/ros2cs_core/interfaces/IExecutor.cs
 - Context is a non-static version of Ros2cs without executor functionality
 - Context and Node are now sealed
 - Node is now internal and is preventing its Context from being collected by the GC
 - not calling Dispose may now leak resources since internal collections may be finalized in the finalizer
 - sizes of rcl_node_t and rcl_conext_t are now invisible to C# code

 create mode 100644 src/ros2cs/ros2cs_core/Context.cs
 delete mode 100644 src/ros2cs/ros2cs_core/Ros2cs.cs
 create mode 100644 src/ros2cs/ros2cs_core/interfaces/IContext.cs
 create mode 100644 src/ros2cs/ros2cs_core/utils/MappingValueView.cs
This should prevent misunderstandings regarding being
in an disposed state and being disposed successfully.
 - is now sealed and internal
 - rcl_publisher_t size is now invisible to C# code
 - removes himself from node on dispose
 - prevents Node from being collected by the GC
This interface contains all methods necessary
for executors to process work.
The async version is equivalent to the non-async version
and an optimization opportunity for task based executors.

 create mode 100644 src/ros2cs/ros2cs_core/interfaces/IWaitable.cs
 - is now sealed and internal
 - rcl_subscription_t size is now invisible to C# code
 - removes himself from node on dispose
 - prevents Node from being collected by the GC
 - is now sealed and internal
 - rcl_service_t size is now invisible to C# code
 - removes himself from node on dispose
 - prevents Node from being collected by the GC
 rename src/ros2cs/ros2cs_core/utils/{MappingValueView.cs => MappedValueDictionary.cs} (87%)
 create mode 100644 src/ros2cs/ros2cs_core/utils/LockedDictionary.cs
 - is now sealed and internal
 - rcl_client_t size is now invisible to C# code
 - removes himself from node on dispose
 - prevents Node from being collected by the GC
 - reduced code duplication
These assertions are only enabled in debug mode
and should make finding bugs easier.
This should prevent the GC from starting
finalization while a native call is running.
This enforces more encapsulation than direct internal access.
Make setting `INode.Executor` the task of the executor
since `INode.SwapExecutor` is performing two actions at once
(adding and removing) which can complicate error handling.
Furthermore, `IExecutor.Wake` is split to allow notifying
executors that changes occurred without being forced to wait.
Returning false could fail since the size
of a bool is different between C, C++ and C#.
This is fixed by adding wrappers which assure that
a byte is returned containing 1 for true and 0 for false.
Furthermore this commit adds missing attributes to ref
parameters in NativeRcl or turns them into out parameters.
This allows wrappers to be put in a wait set by delegating to the wrapped implementation.
Deric-W added 26 commits August 16, 2023 14:34
 - allow for checking which resources became ready
 - implement IExtendedDisposable and IReadOnlyCollection interfaces
 - abstract away wait set resizing and filling
 - hide struct behind IntPtr and C wrappers to handle layout changes and GC moves
It is currently only internal and used for the executor.
 - implementation of IExecutor which has to be spun manually
 - roughly the same as the old spin logic in Ros2cs
 - does not rescan every spin

 create mode 100644 src/ros2cs/ros2cs_core/executors/ManualExecutor.cs
 create mode 100644 src/ros2cs/ros2cs_tests/src/ManualExecutorTest.cs
Guard conditions seem to stay triggered until waited on.
The checks are already done by rcl.
Creating nodes is now thread safe.
The class is now public to allow users to read
the implementation documentation like thread safety.
To prevent access to internal methods the class remains sealed
and uses explicit interface implementations.
The class is now public to allow users to read
the implementation documentation like thread safety.
To prevent access to internal methods the class remains sealed
and uses explicit interface implementations.
The class is now public to allow users to read
the implementation documentation like thread safety.
To prevent access to internal methods the class remains sealed
and uses explicit interface implementations.
The class is now public to allow users to read
the implementation documentation like thread safety.
To prevent access to internal methods the class remains sealed
and uses explicit interface implementations.
The class is now public to allow users to read
the implementation documentation like thread safety.
Furthermore, the primitive collections (and by extension their methods)
are now thread safe.
Async operations are out of scope for this branch.
This commit prevents node primitives from
waiting on the wrong executor while being disposed.
The results are now only exposed as `IEnumerable`s.
This should be more user friendly than `ManualExecutor.Spin`.
Files created by this PR are under the copyright
of ADVITEC Informatik GmbH as suggested by review.
The license is the same as existing files.
This prevents violating the documentation.
This prevents violating the documentation.
To prevent users from having to implement
the feature themselves and provide an alternative
to the old spinning behaviour.
The new executor uses a long running `Task` instead of a `Thread`
to provide better integration with the C# ecosystem.
Threads may fail to wake for some time since the method waits
on a `ManualResetEventSlim` which is reset before the next spin.
To prevent such cases the waiting condition has been modified
to additionally check an integer which is incremented after every spin
to detected if the spin which was waited upon has ended.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants