RabbitMQ Streams timestamp incorrect value in delivered message chunk #3775

wrobell · 2021-11-18T23:29:03Z

wrobell
Nov 18, 2021

First check timestamp of messages

Start RabbitMQ broker.
Store some messages in a stream.
Receive the messages - timestamp is correct.

Now put the laptop to sleep and, without restarting the RabbitMQ broker:

Store new messages in the stream.
Receive the messages - timestamp is incorrect. It is shifted by about the time the laptop slept.

Restart the RabbitMQ broker

Store new messages in the stream.
Receive the messages - timestamp is correct.

I would expect the timestamp to always match the actual time.

lhoguin · 2021-11-19T09:56:18Z

lhoguin
Nov 19, 2021
Maintainer

Hello!

RabbitMQ is built using Erlang/OTP. Erlang/OTP is built for server applications, devices that do not go to sleep. That said, not all is lost!

By default Erlang will do time correction, meaning the Erlang system time slowly moves toward the OS system time. If you continue storing and reading messages, it will at some point catch up with the real time. But this can take a while.
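To give a feel for why catching up "can take a while", here is a rough back-of-the-envelope sketch. It assumes the runtime absorbs an offset by running its clock a small percentage faster or slower than real time. The 1% default is an assumption for illustration (check the Erlang time correction documentation for the actual behavior), and `catch_up_seconds` is a hypothetical helper, not an Erlang or RabbitMQ API.

```python
def catch_up_seconds(offset_seconds: float, correction_percent: float = 1.0) -> float:
    """Estimate how long a slewed clock needs to absorb an offset.

    correction_percent is an ASSUMED maximum frequency adjustment, in percent.
    """
    if correction_percent <= 0:
        raise ValueError("correction_percent must be positive")
    # Each real second closes correction_percent / 100 seconds of the gap.
    return abs(offset_seconds) * 100.0 / correction_percent


# Example: the laptop slept for one hour.
sleep_gap_seconds = 3600.0
hours_needed = catch_up_seconds(sleep_gap_seconds) / 3600.0
print(f"Estimated catch-up time: {hours_needed:.0f} hours")
```

Under that assumed rate, a one hour sleep would take on the order of 100 hours to correct, which is consistent with timestamps staying shifted until the broker is restarted.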

By default Erlang does NOT do time warps, which are what you would want in your case. RabbitMQ is not configured for time warps either. With time warps, Erlang can immediately correct the time when your laptop wakes up from sleep. You can find more information here: https://www.erlang.org/doc/apps/erts/time_correction.html#time-warp-modes (this page also has all the details on time correction if you are curious).

If you want to use time warps, you can export RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS and set it to +C multi_time_warp, for example.
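A minimal sketch of that setting, for a launcher script that builds the broker's environment. The variable name and the `+C` modes come from the comment above and the Erlang documentation. The helper name `with_time_warp` and the assumption that `rabbitmq-server` is on the PATH are illustrative only.

```python
import os
from typing import Dict, Mapping

ERL_ARGS_VAR = "RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS"
# Time warp modes accepted by the Erlang +C flag.
WARP_MODES = {"no_time_warp", "single_time_warp", "multi_time_warp"}


def with_time_warp(env: Mapping[str, str], mode: str = "multi_time_warp") -> Dict[str, str]:
    """Return a copy of env with '+C <mode>' appended to the extra Erlang args."""
    if mode not in WARP_MODES:
        raise ValueError(f"unknown time warp mode: {mode}")
    updated = dict(env)
    existing = updated.get(ERL_ARGS_VAR, "").strip()
    # Keep any flags that were already set and append the time warp flag.
    updated[ERL_ARGS_VAR] = f"{existing} +C {mode}".strip()
    return updated


broker_env = with_time_warp(os.environ)
print(broker_env[ERL_ARGS_VAR])
# To start the broker with this environment (assumes rabbitmq-server is on PATH):
# subprocess.run(["rabbitmq-server"], env=broker_env)
```

The helper copies the environment instead of mutating it, so the calling process is unaffected.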

0 replies

michaelklishin · 2021-11-19T10:36:27Z

michaelklishin
Nov 19, 2021
Maintainer

I will convert this issue to a GitHub discussion. Currently GitHub will automatically close and lock the issue even though your question will be transferred and responded to elsewhere. This is to let you know that we do not intend to ignore this but this is how the current GitHub conversion mechanism makes it seem for the users :(

0 replies

michaelklishin · 2021-11-19T10:37:21Z

michaelklishin
Nov 19, 2021
Maintainer

Putting a laptop to sleep or messing with OS time in some other way will affect the absolute majority of data services in hard to predict ways.

0 replies

wrobell · 2021-11-19T12:01:37Z

wrobell
Nov 19, 2021
Author

If I am not mistaken, the timestamp used here is not Posix time then.

The following RabbitMQ documentation suggests it is Posix though:

1 reply

kjnilsson Nov 19, 2021
Maintainer

It is the Erlang runtime's view of Posix time: https://www.erlang.org/doc/apps/erts/time_correction.html#Erlang_System_Time

wrobell · 2021-11-19T21:43:36Z

wrobell
Nov 19, 2021
Author

It seems that we cannot rely on Posix time to fetch data from a stream. Also, the client has no way of querying the offset timestamp. The offset timestamp needs to be stored client-side and then used on restart. But this can also be achieved with the offset value. It looks like a duplicated feature?

0 replies

kjnilsson · 2021-11-19T21:46:46Z

kjnilsson
Nov 19, 2021
Maintainer

Not at all. The offset is precise. The timestamp is for approximate “read from 10 minutes ago” types of use cases. If you need precision, store the last offset you processed for later resumption.
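A minimal sketch of that advice, assuming the application keeps its own checkpoint in a local JSON file. The file format and the helper names (`save_offset`, `load_offset`, `resume_spec`) are hypothetical and not part of any RabbitMQ client library. When no checkpoint exists, the sketch falls back to an approximate timestamp-based start.

```python
import json
from pathlib import Path
from typing import Optional, Tuple, Union


def save_offset(path: Path, offset: int) -> None:
    """Persist the last processed offset, replacing the old file in one step."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"last_offset": offset}))
    tmp.replace(path)


def load_offset(path: Path) -> Optional[int]:
    """Return the stored offset, or None when nothing has been stored yet."""
    if not path.exists():
        return None
    return json.loads(path.read_text())["last_offset"]


def resume_spec(path: Path, fallback: str = "5m") -> Tuple[str, Union[int, str]]:
    """Choose where to start consuming: precise offset if known, else approximate time."""
    last = load_offset(path)
    if last is None:
        return ("timestamp", fallback)
    # Resume with the message after the last one that was processed.
    return ("offset", last + 1)
```

Returning `None` for a missing checkpoint keeps "nothing stored" distinct from "offset 0 was processed".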

1 reply

wrobell Nov 20, 2021
Author

I am not sure the offset feature can be called precise from the protocol point of view. For example:

producer sends messages one-by-one and client requests message with offset 151, then we get message with offset 151 indeed
producer sends messages in batches of 100, requesting message with offset 151 we get first message with offset 100 and you still need to do filtering or deduplication
finally, if you query the offset for a non-existing reference, you receive 0; it seems we cannot know whether the first message has been processed when using the RabbitMQ mechanism for offset storage

Considering the problems above, the timestamp feature seems equivalent.

acogoluegnes · 2021-11-22T09:01:33Z

acogoluegnes
Nov 22, 2021
Maintainer

"producer sends messages in batches of 100, requesting message with offset 151 we get first message with offset 100 and you still need to do filtering or deduplication"

That's right, the client library needs to keep a reference to the requested offset and filter out the first messages of the first chunk. The library usually handles this, so the application developer does not have to worry about it.
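The filtering described here can be sketched as follows. This is an illustration only: a chunk is modelled as a hypothetical `(first_offset, messages)` pair, which is not the real wire format, and `messages_from` is not a function of any actual client library.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical in-memory model of a delivered chunk: (first offset, message bodies).
Chunk = Tuple[int, List[bytes]]


def messages_from(chunks: Iterable[Chunk], requested_offset: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, body) pairs, skipping messages before the requested offset."""
    for first_offset, messages in chunks:
        for index, body in enumerate(messages):
            offset = first_offset + index
            # Only the first delivered chunk can contain offsets below the request.
            if offset >= requested_offset:
                yield offset, body


# The broker delivers the whole chunk that contains offset 151.
chunk = (100, [f"message {n}".encode() for n in range(100, 200)])
delivered = list(messages_from([chunk], requested_offset=151))
print(delivered[0])
```

The application sees offset 151 as its first message even though the chunk on the wire started at offset 100.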

0 replies

kjnilsson · 2021-11-22T10:03:39Z

kjnilsson
Nov 22, 2021
Maintainer

"Considering the problems above, timestamp feature seems equivalent"

The internal batching is purely an implementation detail and is not actually based on the producer batches, even if it may seem so in isolated experiments. That the client has to filter out messages is also an implementation detail, and something (smart) clients just have to do.

Offsets are guaranteed to be incrementing; timestamps are not. Even if we used monotonic time, the stream leader may change and the next timestamp may be taken on a different server with a different time drift. Hence they cannot be considered equivalent. Offsets should be used to resume processing at a known point. Timestamps are used when providing an approximate time specification, such as "5 minutes ago" (5m).

That an offset query for a non-existent consumer id returns 0 is a bit unfortunate, as it isn't clear whether the consumer id was not found or the stored offset was actually 0. We may look at changing it to return a different response code in this case.
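The leader-change point can be illustrated with a toy simulation. Everything here is hypothetical: `Node`, `append_log`, and the skew values are made up for the example and do not model the real stream implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    name: str
    clock_skew_ms: int  # how far this node's clock is from true time

    def now_ms(self, true_time_ms: int) -> int:
        return true_time_ms + self.clock_skew_ms


def append_log(events: List[Tuple[int, Node]]) -> List[Tuple[int, int]]:
    """events: (true time in ms, current leader). Returns (offset, timestamp) entries."""
    # The offset is a counter; the timestamp comes from whichever node is leader.
    return [(offset, leader.now_ms(t)) for offset, (t, leader) in enumerate(events)]


node_a = Node("node-a", clock_skew_ms=0)
node_b = Node("node-b", clock_skew_ms=-5000)  # five seconds behind

# The leader moves from node-a to node-b between the second and third write.
log = append_log([
    (1_000_000, node_a),
    (1_001_000, node_a),
    (1_002_000, node_b),
    (1_003_000, node_b),
])
offsets = [offset for offset, _ in log]
timestamps = [ts for _, ts in log]
print(offsets)
print(timestamps)
```

In this run the offsets keep increasing while the timestamp of the third entry is earlier than that of the second, which is why a timestamp cannot stand in for an offset when resuming.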

0 replies

acogoluegnes · 2021-11-22T10:36:54Z

acogoluegnes
Nov 22, 2021
Maintainer

Follow-up issue when no offset is stored: #3783.

0 replies

wrobell · 2021-11-22T18:19:46Z

wrobell
Nov 22, 2021
Author

Thanks for all the information and for creating the bug for the zero offset issue.

Is there a reason why RabbitMQ shifts the responsibility of being "smart" to a client when offset is in the middle of a block? There are always multiple client libraries, so it seems it could be better to perform offset filtering RabbitMQ side?

Regarding the timestamp. I would like to create pull request for RabbitMQ Streams protocol documentation to add the link to Erlang system time documentation. Would that be accepted?

Actually, I would avoid using terms like "posix-ish" as this suggests it is OS time, I believe. For example, the time can be changed after a RabbitMQ restart, and a request to get data from the last 5 minutes is not going to work until Erlang's system time catches up.

0 replies

acogoluegnes · 2021-11-23T08:03:49Z

acogoluegnes
Nov 23, 2021
Maintainer

"Is there a reason why RabbitMQ shifts the responsibility of being "smart" to a client when offset is in the middle of a block? There are always multiple client libraries, so it seems it could be better to perform offset filtering RabbitMQ side?"

The protocol network adapter (stream plugin) delivers chunks of messages using send_file. It does not know about the structure of those chunks (header + messages), the client does. Delivering a partial chunk would couple the stream plugin to the storage format and require some costly processing on the server side. And the partial chunk would not have the header, which contains critical information (like... the number of messages and the length of the data, for parsing).
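To show why the header matters for parsing, here is a sketch using a deliberately simplified, hypothetical chunk layout: a header with the first offset, the number of messages, and the data length, followed by length-prefixed messages. This is NOT the actual RabbitMQ stream chunk format, only an illustration of the idea.

```python
import struct
from typing import List, Tuple

# HYPOTHETICAL header: first offset (u64), message count (u32), data length (u32).
HEADER = struct.Struct(">QII")
LENGTH_PREFIX = struct.Struct(">I")


def encode_chunk(first_offset: int, messages: List[bytes]) -> bytes:
    data = b"".join(LENGTH_PREFIX.pack(len(m)) + m for m in messages)
    return HEADER.pack(first_offset, len(messages), len(data)) + data


def decode_chunk(buf: bytes) -> List[Tuple[int, bytes]]:
    """Parse a whole chunk; the header says how many messages and bytes to expect."""
    first_offset, count, data_length = HEADER.unpack_from(buf, 0)
    pos = HEADER.size
    end = pos + data_length
    out = []
    for index in range(count):
        (size,) = LENGTH_PREFIX.unpack_from(buf, pos)
        pos += LENGTH_PREFIX.size
        out.append((first_offset + index, buf[pos:pos + size]))
        pos += size
    if pos != end:
        raise ValueError("data length in header does not match parsed messages")
    return out


chunk = encode_chunk(100, [b"a", b"bb", b"ccc"])
print(decode_chunk(chunk))
```

A slice that starts in the middle of such a chunk has no header, so the reader would not know the offset of the first message or where the data ends. Sending whole chunks lets the server hand over stored bytes without inspecting them.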

"Regarding the timestamp. I would like to create pull request for RabbitMQ Streams protocol documentation to add the link to Erlang system time documentation. Would that be accepted?"

We would consider it, for sure.

0 replies

michaelklishin · 2021-11-23T08:12:33Z

michaelklishin
Nov 23, 2021
Maintainer

I'm not convinced we should change the documentation because of a use case that no realistic data service experiences in production outside of very specific circumstances (vMotion and the like). We already mention clock synchronisation as important between cluster nodes. System time jumping around is going to affect virtually every piece of software, and Erlang's behavior when time shifts underneath the runtime is reasonable.

3 replies

wrobell Nov 23, 2021
Author

For example, a Raspberry Pi needs to use NTP synchronization to get the proper time (it has no RTC).

To allow correct timestamp-based stream requests, RabbitMQ must not be started before the time is synchronized. IMHO, the documentation needs to be very explicit to avoid confusion.
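One way to enforce that ordering is a small wait loop in the startup script. The sketch below assumes a systemd-based system where `timedatectl show --property=NTPSynchronized --value` prints `yes` once the clock is synchronized. The helper names, timeout, and poll interval are illustrative choices.

```python
import subprocess
import time
from typing import Callable


def ntp_synchronized() -> bool:
    """Ask systemd whether the clock is NTP-synchronized (assumes timedatectl exists)."""
    result = subprocess.run(
        ["timedatectl", "show", "--property=NTPSynchronized", "--value"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() == "yes"


def wait_for_sync(
    check: Callable[[], bool] = ntp_synchronized,
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until check() is true; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


# Usage in a startup script (not executed here):
# if wait_for_sync():
#     subprocess.run(["rabbitmq-server"])  # assumes rabbitmq-server is on PATH
```

The `check` and `sleep` parameters are injectable so the loop can be exercised without touching the real clock or calling `timedatectl`.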

michaelklishin Nov 24, 2021
Maintainer

In practice, any system provisioned must run NTP or something similar as soon as there are multiple cluster nodes. And with most deployment tools, you either force an NTP sync or rely on it to have happened before e.g. RabbitMQ starts, and most of the time it does.

michaelklishin Nov 24, 2021
Maintainer

Sure, we can document this and no one will read that documentation before they run into this obscure edge case. It's not even clear what doc guide this should go into (we already mention the importance of time synchronisation in IIRC the Clustering guide because it is most relevant in that context).

michaelklishin · 2021-11-24T09:34:55Z

michaelklishin
Nov 24, 2021
Maintainer

I feel this conversation has run its course. Coming up with examples of increasingly more obscure environments (sorry but most people do not run production data services on Raspberry Pis or laptops that are put to sleep) won't get us anywhere.

I'll make sure that our clustering guide does mention time synchronisation between nodes but beyond that, time monotonicity is a topic that's important for so many features that we would be forced to add it to every guide. Or we can just rely on developer and operator common sense around this.

0 replies
