Replies: 13 comments 5 replies
-
Hello! RabbitMQ is built on Erlang/OTP, which is designed for server applications, that is, machines that do not go to sleep. That said, not all is lost! By default Erlang performs time correction, meaning the Erlang system time slowly moves toward the OS system time. If you continue storing and reading messages, it will eventually catch up with the real time, but this can take a while. By default Erlang does NOT do time warps, which is the behavior you would expect in your case, and RabbitMQ is not configured for time warps either. With time warps, Erlang can correct the time immediately when your laptop wakes up from sleep. You can find more information here: https://www.erlang.org/doc/apps/erts/time_correction.html#time-warp-modes - this page has all the details on time correction if you are curious. If you want to use time warps, you can export
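To get a feel for how long "a while" can be, here is a rough back-of-the-envelope sketch. It assumes the runtime closes the gap by running its clock about 1% faster than real time; that figure is an assumption made for illustration, so check the time correction page linked above for the actual behavior of your Erlang/OTP version. The function name is made up for this example.

```python
def catch_up_seconds(gap_seconds: float, slew_percent: float = 1.0) -> float:
    """Real time needed for a gradually corrected clock to close a gap.

    slew_percent is how much faster than real time the corrected clock runs.
    The 1% default is an assumption, not something stated in this thread.
    """
    if slew_percent <= 0:
        raise ValueError("slew_percent must be positive")
    return gap_seconds * 100.0 / slew_percent


# A laptop that slept for one hour leaves a 3600 s gap to close.
gap = 3600.0
print(f"{catch_up_seconds(gap) / 3600.0:.0f} hours to catch up")
```

Under that assumption, a one-hour sleep would take on the order of 100 hours of uptime to correct, which is why a time warp (an immediate jump) is the only quick fix.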
-
I will convert this issue to a GitHub discussion. GitHub will automatically close and lock the issue even though your question will be transferred and answered elsewhere. We are telling you this so you know we do not intend to ignore it; it is just how the current GitHub conversion mechanism makes it look to users :(
-
Putting a laptop to sleep or messing with OS time in any other way will affect the vast majority of data services in hard-to-predict ways.
-
If I am not mistaken, the timestamp used here is not Posix time then. The following RabbitMQ documentation suggests it is Posix though:
-
It seems that we cannot rely on Posix time to fetch data from a stream. Also, the client has no way of querying the offset timestamp. The offset timestamp needs to be stored client-side and then used on restart, but this can also be achieved with the offset value. It looks like a duplicated feature?
-
Not at all. The offset is precise. The timestamp is for approximate “read
from 10 minutes ago” type of use cases. If you need precision, store the last
offset you processed for later resumption.
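A minimal sketch of the "store the last offset you processed" advice, assuming the application persists offsets itself in a local file. `FileOffsetStore` and `resume_offset` are hypothetical helpers written for this example; they are not part of RabbitMQ or of any client library.

```python
import json
import os
import tempfile
from typing import Optional


class FileOffsetStore:
    """Persists the last processed offset in a small JSON file (hypothetical helper)."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> Optional[int]:
        # None means "nothing stored yet", which is distinct from a stored offset of 0.
        try:
            with open(self.path, "r", encoding="utf-8") as f:
                return int(json.load(f)["offset"])
        except FileNotFoundError:
            return None

    def save(self, offset: int) -> None:
        # Write to a temporary file and rename it, so a crash cannot leave a half-written file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"offset": offset}, f)
        os.replace(tmp, self.path)


def resume_offset(store: FileOffsetStore) -> int:
    """First offset to request on restart: one past the last processed, or 0 if none."""
    last = store.load()
    return 0 if last is None else last + 1
```

On restart the consumer would request `resume_offset(store)` as its starting offset and call `store.save(offset)` after processing each message (or each batch, if occasional reprocessing is acceptable).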
--
*Karl Nilsson*
-
That's right, the client library needs to keep a reference to the requested offset and filter out the first messages of the first chunk. The library usually handles this, so the application developer does not have to worry about it.
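The filtering described here can be sketched as follows, using the case from this thread where offset 151 is requested but delivery starts with the chunk that begins at offset 100. The `(offset, payload)` tuples and the `from_offset` function are simplified stand-ins invented for this example; real client libraries have their own chunk and message types.

```python
from typing import Iterable, Iterator, List, Tuple

Message = Tuple[int, bytes]  # (offset, payload), a simplified stand-in


def from_offset(chunks: Iterable[List[Message]], requested: int) -> Iterator[Message]:
    """Yield messages with offset >= requested, dropping the head of the first chunk."""
    for chunk in chunks:
        for offset, payload in chunk:
            if offset >= requested:
                yield offset, payload


# Two chunks of 100 messages each; the first one starts before the requested offset.
chunks = [
    [(o, b"m") for o in range(100, 200)],
    [(o, b"m") for o in range(200, 300)],
]
delivered = list(from_offset(chunks, requested=151))
print(delivered[0][0], len(delivered))
```

Only the first chunk ever contains messages below the requested offset, so a real client could stop comparing once it has passed that point; the sketch keeps the check on every message for simplicity.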
-
"Considering the problems above, timestamp feature seems equivalent"
The internal batching is purely an implementation detail and is not
actually based on the producer batches, even if it may seem so in isolated
experiments. That the client has to filter out messages is also an
implementation detail, and something (smart) clients just have to do.
Offsets are guaranteed to be incrementing; timestamps are not. Even if we
used monotonic time, the stream leader may change and the next timestamp
may be taken on a different server with a different time drift. Hence the
two cannot be considered equivalent. Offsets should be used to resume
processing at a known point. Timestamps are used when providing an
approximate time specification, such as "5 minutes ago" (5m).
That a non-existent consumer id offset query returns 0 is a bit unfortunate,
as it isn't clear whether the consumer id was not found or the stored offset
was actually 0. We may look at changing it to return a different response
code in this case.
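The point above about leader changes can be illustrated with a toy model. This is only a sketch: `Node`, `Chunk`, `write_chunks` and the skew values are invented for the example and say nothing about how RabbitMQ stores chunks internally. It only shows that offsets keep increasing while timestamps taken on servers with different clock drift may not.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    clock_skew_ms: int  # how far this node's clock is from true time (illustrative)

    def now_ms(self, true_time_ms: int) -> int:
        return true_time_ms + self.clock_skew_ms


@dataclass
class Chunk:
    first_offset: int
    timestamp_ms: int
    leader: str


def write_chunks(leaders: List[Node], chunk_size: int = 100,
                 start_ms: int = 1_000_000, step_ms: int = 1000) -> List[Chunk]:
    """Write one chunk per entry in leaders; entry i is the leader for chunk i."""
    return [
        Chunk(first_offset=i * chunk_size,
              timestamp_ms=node.now_ms(start_ms + i * step_ms),
              leader=node.name)
        for i, node in enumerate(leaders)
    ]


a = Node("a", clock_skew_ms=0)
b = Node("b", clock_skew_ms=-5000)  # b's clock is 5 s behind a's
log = write_chunks([a, a, b, b])    # leadership moves from a to b after two chunks

offsets = [c.first_offset for c in log]
stamps = [c.timestamp_ms for c in log]
print(offsets)
print(stamps)
```

In this model the third chunk carries an earlier timestamp than the second, even though its offset is higher, which is why a timestamp is only suitable as an approximate starting point.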
…On Mon, 22 Nov 2021 at 09:01, Arnaud Cogoluègnes ***@***.***> wrote:
- producer sends messages in batches of 100, requesting message with
offset 151 we get first message with offset 100 and you still need to do
filtering or deduplication
--
*Karl Nilsson*
-
Follow-up issue when no offset is stored: #3783.
-
Thanks for all the information and for creating the issue for the zero offset problem. Is there a reason why RabbitMQ shifts the responsibility of being "smart" to the client when the offset is in the middle of a block? There are always multiple client libraries, so would it not be better to perform offset filtering on the RabbitMQ side? Regarding the timestamp: I would like to create a pull request for the RabbitMQ Streams protocol documentation to add a link to the Erlang system time documentation. Would that be accepted? I would also avoid terms like "posix-ish", as I believe this suggests OS time. For example, time can be changed after a RabbitMQ restart, and a request for data from the last 5 minutes will not work until Erlang's system time catches up.
-
The protocol network adapter (stream plugin) delivers chunks of messages using
We would consider it, for sure.
-
I'm not convinced we should change the documentation because of a scenario that no realistic data service experiences in production outside of very specific circumstances (vMotion and the like). We already mention that clock synchronisation between cluster nodes is important. System time jumping around is going to affect virtually every piece of software, and Erlang's behavior when time shifts underneath the runtime is reasonable.
-
I feel this conversation has run its course. Coming up with examples of increasingly obscure environments (sorry, but most people do not run production data services on Raspberry Pis or laptops that are put to sleep) won't get us anywhere. I'll make sure that our clustering guide mentions time synchronisation between nodes, but beyond that, time monotonicity matters for so many features that we would be forced to add it to every guide. Or we can just rely on developer and operator common sense around this.
-
First, check the timestamp of messages
Now, put the laptop to sleep and, without restarting the RabbitMQ broker
Restart the RabbitMQ broker
I would expect the timestamp to always match the actual time.