Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Progress-bar for ros2 bag play #1836

Open
wants to merge 17 commits into
base: rolling
Choose a base branch
from

Conversation

nicolaloi
Copy link
Contributor

@nicolaloi nicolaloi commented Oct 15, 2024

Closes #1762: Adds a progress bar for ros2 bag play, showing the bag time and duration, similar to what is seen in ROS1.

In ROS1, the progress bar was easy to handle because the rosbag play command didn't print additional output during playback. However, in ROS2, many logs can be printed during playback (e.g., keyboard control logs).

The progress bar is printed and continuously updated on the last line of the playback output. I have tested this initial implementation with keyboard controls (play, pause, play next, set rate), Ctrl+C, throwing an exception, and the flags --log-level debug, --delay X, and --start-offset X.

To try it, use the new --progress-bar flag when playing a bag.

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Oct 15, 2024

video-example.mp4

(at 2.6-2.7 seconds of bag duration, the progress bar is slightly updating while in paused state because I was clicking the left arrow for the play next keyboard control)

Since ros2 bag play generates more logs than its ROS1 counterpart, the key aspect of this PR is to print a string shorter than any potential log message not coming from rosbag2_player. For example, with the current draft of the progress bar, this prints nicely:

[INFO] [1729028986.845909253] [rosbag2_player]: Pausing play.
 Bag Time 1718553782.525239  Duration 1.272325/6.974378 [P]

while this longer version of the progress bar doesn't work properly without extra line clearing (in this case, it's easy to clear the line since the logs come from rosbag2_player., but for "external" logs, this becomes more difficult):

[INFO] [1729028986.845909253] [rosbag2_player]: Pausing play.us:Running]
 Bag Time 1718553782.525239  Duration 1.272325/6.974378 [status:Paused]

@MichaelOrlov looking forward to your feedback, for now I have made a test MVP code to check the feasibility of the feature.

Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from d503e5f to 2cb732d Compare October 15, 2024 22:24
@nicolaloi nicolaloi changed the title Start adding progress-bar for ros2 bag play [wip] Progress-bar for ros2 bag play Oct 16, 2024
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from b815f9e to 4ba1a8c Compare October 20, 2024 22:56
@nicolaloi nicolaloi changed the title [wip] Progress-bar for ros2 bag play Progress-bar for ros2 bag play Oct 21, 2024
Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nicolaloi Thank you for your contribution.
I have a few concerns about the design and implementation.
First, we need to limit the frequency with which we will print in std::cout. The best way to do this is to create a local variable, aka size_t progress_bar_update_freq_ (in Hz).
e.g. 1 Hz = once per/sec, 2 Hz = twice per second, 3 = three times per second etc..
Then, check the delay between the last call in the print_status(..). If the delay is more than the threshold, print; otherwise, skip.
We also need to move the check for the if (disable_progress_bar_) inside print_status(..) to make the logic cleaner and in one place.
Calling double current_time_secs = RCUTILS_NS_TO_S(static_cast<double>(clock_->now())); each time in the print_status(..) is very inefficient and expensive since we are locking mutex inside and transorming current time to the ROS time. We actually know the current ROS time at which we are playing. This is a timestamp from the message. I would suggest passing it to the print function as a parameter.
i.e., add tiemstamp parameter to the print_status(..) function and use message_ptr->recv_timestamp as a source of timestamp.

I found the implementation of the print_status(..) function a bit messy.
That is a quintessence of the modern C++17 with templates and old style plain C code.
Need to avoid using C code. Please use C++ alternatives. I am not sure if we really need to flush buffer each time. Flush is an expensive operation since this is a blocking system call. Need to avoid it if possible.

IMO templating print_status(..) function is overengineering and make code less readable and less understandable. The cleaner solution would be if using enum class values for different modes. In this case all extra wrappers like check_and_print_status() will not be needed. We will be able to call print_status(PrintStatus::Paused) directly from where we needed it.

rosbag2_transport/src/rosbag2_transport/player.cpp Outdated Show resolved Hide resolved
@nicolaloi
Copy link
Contributor Author

@MichaelOrlov thanks for the comments.

  • limit printing frequency, mutex and flushing: yes I already had a local change to allow the user to choose the update frequency from the CLI (default 10-20 Hz) to reduce the mutex locking and output stream flushing. But indeed using the timestamp of the last published message is better than calling clock_->now().
    -- The rationale of clock_->now() was to be able to have a consistent time flow if you want to print the progress bar at a fixed rate but messages are not close in time. For example, if the next message is 3 seconds in the future, for the next 3 seconds the progress bar time will appear frozen since it is printing the same timestamp. But yes in practice is a niche case.
    -- Regarding the flush, updating the progress bar without flushing will not update the printed time (or anyway it will get flushed automatically after some update calls, but there can be artifacts if other external ros logs are being printed). So instead of calling the progress bar without flushing, I would just reduce the calls to the progress bar printing with the already mentioned user-defined frequency update. I would say that 10-20 Hz can be a good default value? Or even lower to be conservative?
  • add timestamp parameter to the print_status: for this I would save it as a class attribute and use it within print_status instead of adding it as an input, since in many instances where print_status is being called there is no direct easy access to the last message.
  • print_status implementation: ok I will remove the template and use enums.

Should the progress bar be disabled by default, or enabled by default with a default low update frequency?

@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch 2 times, most recently from 253256a to 44fa94a Compare October 28, 2024 23:35
nicolaloi and others added 2 commits October 28, 2024 23:38
Signed-off-by: Nicola Loi <[email protected]>
Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from 44fa94a to 43fa0f8 Compare October 28, 2024 23:38
@nicolaloi
Copy link
Contributor Author

I have made changes as discussed.

Linked modifications:

I have modified the play_next function to accept a boolean input argument 'is_burst_mode' (default to false); so when play_next is called in burst mode, the print_status is not called at all, to avoid any delay (but now the progress bar will freeze during burst).

print_status will update the progress bar at the defined update frequency if the status is RUNNING; for the other states (or if the new boolean argument 'force_update' is true) it will update it disregarding the update frequency. This is to take care of the istances where a new log message from the rosbag player is being printed (like 'rate changed to 1.1', 'player paused', warnings..., anyway these are not high-frequency logging) to make the progress bar visible again. Otherwise, it will temporarily or permanently (permanently if the player gets paused) disappear since it is overwritten in the terminal by the previous log message.

@MichaelOrlov
Copy link
Contributor

@nicolaloi I have not had a chance yet to look deeply into the code after your latest changes. However, I have a few comments.

  1. I am ok with a flush; I anticipated that it is likely some constraints as you described.
  2. Making the progress bar enabled by default is a good idea. However, the frequency should be of integer type (size_t) in C++, and 10 Hz IMO is to fast. I would put 3-5 Hz at max by default.
  3. We don't need flag play_options.disable_progress_bar at all we can rely on the play_options.progress_bar_update rate. i.e. if it is 0 = disabled.

@nicolaloi
Copy link
Contributor Author

Ok, I'll set the default rate to 3 Hz.

For 2nd and 3rd points, there is no longer the --disable_progress_bar option, but a --progress_bar option that saves the frequency rate as a double, and if the value is ==0 the progress bar is disabled, and if <0 it is updated for every message. So the user has the extra option to enable it for each message if they need to, without having to set a large enough frequency rate. If it's not okay I'll change to size_t as you suggested.

Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from eb9ea35 to 9cdf5ad Compare November 10, 2024 22:56
@nicolaloi
Copy link
Contributor Author

I have realized the newly added variables progress_bar_last_time_printed_ and recv_timestamp_last_published_ in commit 43fa0f8 could have synchronization issues since the function print_progress_bar_status can be called contemporaneously from a service callback (like /play, /pause, /play_next), I need to take a better look at them.

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Nov 18, 2024

@MichaelOrlov So, to avoid synchronization issues without using mutexes, etc to not increase overhead, the variable progress_bar_last_time_printed_, which tracks the last update time of the progress bar, is now accessed exclusively within the specialized function update_progress_bar_after_message_published. This function is called in code sections that are already protected by an existing mutex (skip_message_in_main_play_loop_mutex_). The progress bar updates at the specified rate only during the "standard" and recurrent event of a message being published.
For all other "extraordinary" and less common events occurring in separate threads (user triggers pause, resume, change rate, play next, etc, via keyboard controls or services) the function update_progress_bar is used. To avoid synchro issues this function does not access progress_bar_last_time_printed_. It will always print the progress bar so it appears again immediately after the new log messages; however, it does not update progress_bar_last_time_printed_. If I can use a mutex or an atomic int64, I can update progress_bar_last_time_printed_ for these "extraordinary" events as well (i.e. ensuring the progress bar timing accounts for these extraordinary updates), but I don't think is needed given they are not common.

Regarding how to print the current timestamp: when a message is published, the update_progress_bar_after_message_published function is invoked with the message's timestamp as input as suggested. However, for the "extraordinary" events (user triggers pause, resume, change rate, play next, etc, via keyboard controls or services) where the timestamp is not directly available, the saved timestamp of the latest published message (recv_timestamp_last_published_) is not used anymore, since it could have synchro issues, but clock_->now() is used. This locks the inner mutex of clock_, but these events are very rare (or may not occur at all) compared to message publishing.

Extras:

  • fixed printed timestamp when extarordinary events occur while in delayed status (used an atomic bool, but it is accessed only for these rare extraordinary events), related to Rosbag2 play --clock option ignores delay #1858 (clock_ is not paused when in delayed status);
  • added burst status [B], so in case of a large number of messages being burst, the progress bar will not be frozen at [P]aused status. It will be frozen at [B]urst status, with the [B] conveying the information that messages are currently being burst (progress bar timestamp is frozen because normal continuous updates are blocked during burst to avoid delays);
  • reduced the fixed precision of the displayed bag duration, from microseconds to milliseconds (Duration 1.862/74.434), to reduce the overall length of the progress bar string. Anyway, the displayed bag time (Time 1718553783.196209) has still microsecond precision.

@nicolaloi nicolaloi marked this pull request as ready for review December 4, 2024 23:06
@MichaelOrlov
Copy link
Contributor

@nicolaloi I have not had a chance yet to make a new full round of review. However, I have a comment about reducing the precision for output bag duration.
We used to use sec.nanosec format for timestamps in the bag and this is an "unified" format for other tools in ROS 2.
IMO make sense to use it without reduction to the milliseconds or nanoseconds. Please see example in the

template<>
struct convert<rclcpp::Duration>
{
static Node encode(const rclcpp::Duration & duration)
{
Node node;
node["sec"] = duration.nanoseconds() / 1000000000;
node["nsec"] = duration.nanoseconds() % 1000000000;
return node;
}
static bool decode(const Node & node, rclcpp::Duration & duration)
{
duration = rclcpp::Duration(node["sec"].as<int32_t>(), node["nsec"].as<uint32_t>());
return true;
}
};

The possible use case is that the user may stop playback at the point of interest and then copy past the current timestamp, for instance, to the Foxglove Studio to be able to open the same bag at the current location for further introspection.

@nicolaloi
Copy link
Contributor Author

I wasn't using nanoseconds because the problem with them is that the progress bar message will be longer than other log messages, and artifacts will appear:

the key aspect of this PR is to print a string shorter than any potential log message not coming from rosbag2_player.

Using nanoseconds for example:

[INFO] [1733429800.530785030] [rosbag2_player]: Pausing play.13 [R]    
[INFO] [1733429801.732687906] [rosbag2_player]: Set rate to 1.1 [P]    
 Bag Time 1718553785.447256804 Duration 4.194342136/6.974378413 [P] 

Some digits and the final [P] will stay at the end of previous log messages.

But if nanoseconds for both bag time and bag duration are a must, then I can make space for their nanosecond digits by

  • reducing the progress bar string by removing the anyway static ending time info from the progress bar (can be printed before, together with the initial log info messages of the player);
  • reducing the progress bar string by removing/shortening the "Bag Time" and "Duration" words.

So something like (printing an explanation for t: and d: at the beginning, together with the ending time info):

[INFO] [1733433036.847650453] [rosbag2_player]: Resuming play.
 Bag t: 1718553786.279276609 d: 5.026361942 [R]

I will push this now.

@MichaelOrlov
Copy link
Contributor

@nicolaloi But why ??

the key aspect of this PR is to print a string shorter than any potential log message not coming from rosbag2_player.

Why not have a longer length, whatever we need, and whatever will be readable?

@MichaelOrlov
Copy link
Contributor

Can we print a separator line in between? Like

[INFO] [1733429800.530785030] [rosbag2_player]: Pausing play.13 [R]    
[INFO] [1733429801.732687906] [rosbag2_player]: Set rate to 1.1 [P]    
===================================================================
 Bag Time 1718553785.447256804 Duration 4.194342136/6.974378413 [P] 

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Dec 6, 2024

I am using short strings to try to avoid as much as possible 'poisoning' the log messages of other nodes that could print on stdout on the same terminal, given that at the end of the progress bar there is a carriage return '\r'. For example, a long string like:

`Bag Time 1718553785.447256804 Duration 4.194342136/16.974378413 [R]` 

will be overwritten (due to `\r') by this external log message from the user in the same terminal:

[INFO] [1733429801.732687906] [my_node] SLAM i 7, inliers: 25

into

[INFO] [1733429801.732687906] [my_node] SLAM i 7, inliers: 25413 [R]

And now you cannot know if the inliers were 2, 25, 254, 2541, or 25413.
While with a shorter progress bar string:

[INFO] [1733429801.732687906] [my_node] SLAM i 7, inliers: 25
 Bag t: 1718553785.447256804 d: 16.194342136 [R]

Then, there can always be shorter strings than any possible progress bar, especially if the user just prints using std::cout and not RCLCPP_INFO_STREAM. So this problem could happen whatever the progress bar length is, but using a shorter one makes it less likely. Especially, I was trying to have a progress bar at least shorter than any log message that could be printed by rosbag2_player, so the various "Set rate to 1.2", "Pausing play", etc.. printed by the rosbag2_player would not become something like Pausing play.13 [R].

This was my reason for using short progress bar strings, but if these are not issues or anyway something that the progress bar should not take care of, then yes I can push a progress bar with the long version.

Regarding the separator line, I can try to think about it, maybe it is feasible using only raw ANSI escape codes, maybe it is difficult to implement without side-effects or without using a terminal control library like curses. Especially considering that you don't have control or knowledge of all the messages that could be printed in the terminal.

@MichaelOrlov
Copy link
Contributor

@nicolaloi As far as I understand you are warring about output std::cout output mashup due to the calling it from the multiple threads.
I found some explanation on https://stackoverflow.com/questions/14718124/how-to-easily-make-stdcout-thread-safe

//Thread 1
std::cout << "the quick brown fox " << "jumped over the lazy dog " << std::endl;

//Thread 2
std::cout << "my mother washes" << " seashells by the sea shore" << std::endl;

//Could just as easily print like this or any other crazy order.
my mother washes the quick brown fox seashells by the sea shore \n
jumped over the lazy dog \n

The proposed solution is simply compose your string before you pass it to cout

For example.

//There are other ways, but stringstream uses << just like cout.. 
std::stringstream msg;
msg << "Error:" << Err_num << ", " << ErrorString( Err_num ) << "\n"; 
std::cout << msg.str();

This way your stings can't be garbled because they are already fully formed, plus its also a better practice to fully form your strings anyway before dispatching them.

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Dec 6, 2024

I am already composing the string with an ostringstream before printing it.
What I meant is related to the carriage return: the progress bar ends with a \r, so the next progress bar / rosbag_player log message / other nodes log message will be printed on the same line, not on a new line, effectively overwriting the current line, where the progress bar is. But the new message can overwrite only the first len(new message) characters. If the new log message is longer than the progress bar, the last len(new message)-len(progress bar) characters will stay in the terminal at the end of the new log message, like the example from my previous message.
This could be solved by clearing the line with ANSI code before printing any new log message, but you need to modify all the code that prints to terminal. Which can be done for every RLCPP_X_STREAM within the player.cpp file, but you miss all the RCLCPP_X_STREAM from the files imported by player.cpp, and all the prints effectuated by the user with their nodes:

while this longer version of the progress bar doesn't work properly without extra line clearing (in this case, it's easy to clear the line since the logs come from rosbag2_player., but for "external" logs, this becomes more difficult):

@MichaelOrlov
Copy link
Contributor

@nicolaloi Ideally it would be nice to have something similar to the https://github.com/xqms/rosbag_fancy

@xqms Since you are the author of `rosbag_fancy´, your help with this PR and/or ideas on how to output the status bar without smashing other log messages will be very appreciated.

@MichaelOrlov
Copy link
Contributor

@nicolaloi In a short term I agree to shorten the status string to something like

[1718553785.447256804] Duration 4.194342136/16.974378413 [R]

i.e. put the current timestamp into the square brackets [ ] to save space in line, and/or try to use a delimiter line.

@MichaelOrlov
Copy link
Contributor

@nicolaloi General comment: Could you please rename the term print frequency and other "frequency" names to the progress bar update rate?
The frequency and update rate are synonyms in this case.
However, frequency is a more scientific term, while the update the rate with times per second would be more user-friendly and easily understandable for everyone.

@nicolaloi
Copy link
Contributor Author

Ok based on the comments, I will change the progress bar string to [1718553785.447256804] Duration 4.19/16.97 [R]

@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from 6a3cbc3 to fb464b0 Compare December 6, 2024 21:42
@nicolaloi
Copy link
Contributor Author

force pushed since I forgot DCO in a previous commit

@xqms
Copy link

xqms commented Dec 7, 2024

Hey, here's how rosbag_fancy does it:

  1. Print a few newline characters (completely clearing the line each time using ANSI control codes). This leaves space for other log messages so that they do not clobber the progress bar / UI
  2. Print the progress bar (completely clearing the line each time)
  3. Move the cursor back up to the blank lines so that log messages will appear there.

This also ensures that the progress bar is always at the bottom.

Note that this approach assumes that you update the UI more often than log messages appear.

Here's a link to the last step in the code: https://github.com/xqms/rosbag_fancy/blob/1f5ec4d172418a44ee206749182d0bec7cc553f0/rosbag_fancy/src/ui.cpp#L334

However, rosbag_fancy does not have to worry about multi-threading.

@MichaelOrlov
Copy link
Contributor

@nicolaloi Could you please rebase your branch and address conflicts one more time?
We pushed another PR recently which overlaps a bit with changes.

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Dec 8, 2024

@xqms thanks for the suggestions. Adding extra lines is indeed a good idea to mitigate the problem. External user's log messages shorter than the progress bar string can still be smashed, but the extra lines give more safety margin.

Note that this approach assumes that you update the UI more often than log messages appear.

Due to this, I have added an extra option, --progress-bar-separation-lines (default to 2), so for these cases the user can increase the amount of blank lines. I have implemented this in the code so it can be tested. Here's a video showing 3 external user's log messages (with the second one being a short message) being printed between two progress bar updates, testing with 4,3,2,1, and 0 separation lines:

video_example.mp4

From the video, you can see that with 3 external log messages (with one being a short message), you could encounter some issues with 2,1, and 0 separation lines.

General rule: if between two progress bar updates you are expecting N external log messages and at least one is shorter than the progress bar strings, then you should set at least N --progress-bar-separation-lines. Or, you should increase the progress bar update rate.

However, if the external log messages are longer than the progress bar strings, or there will be less than "default separation lines" (2) external log messages between two progress bar updates, then everything is ok by default. Moreover, this is an issue only if the user has his own log messages on the same output of the playback player and the progress bar is enabled.

@MichaelOrlov I have also added the separator line (=======)

Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi
Copy link
Contributor Author

friendly ping

@MichaelOrlov
Copy link
Contributor

@nicolaloi Sorry for the delay with the review.
Overall, while the functionality is working, the code quality is still far from the production level and needs to be improved/redesigned. Also lack of unit tests.
I will provide more details in a couple of days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add progress bar and timestamp when playing back a bag
4 participants