Queue index file cannot be recovered if it has messages with TTL on 3.8.17+ #3272
-
Hi, we've had two different clusters upgrading to 3.8.19 where the message store got corrupted:
We can share full logs and, if of interest, the data folder privately.
-
More here: #3253 (comment)
-
It has happened again, this time to a single-node cluster running Erlang 24.0.4.
-
We at CloudAMQP have stopped provisioning 3.8.19 because so many customers get corrupted message stores from this version.
-
Most recent queue index changes were #2954 and #3041 in
-
#2954 seems to be a lot more relevant than #3041. As of #2954, index files by default store 2048 entries instead of 16384. This value [...] Nodes will log [...] on boot and the [...] Without knowing if this was an upgrade and from what version, we can't tell what segment entry count value should be in effect. But you [...]
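To illustrate why the segment entry count in effect matters, here is a minimal Python sketch. It assumes a simplified model in which a message's sequence ID is mapped to a segment file by integer division by the entry count; the function name `locate` and the sample sequence ID are hypothetical and not taken from RabbitMQ's code. Under that assumption, the same sequence ID resolves to a different segment and position depending on which count is used, so an index written with one value would not be read back correctly with the other.

```python
# Illustrative only: a simplified model of how a queue index might map
# a message sequence ID to a segment file and a position within it.
OLD_ENTRY_COUNT = 16384  # default before #2954, per the comment above
NEW_ENTRY_COUNT = 2048   # default as of #2954, per the comment above


def locate(seq_id, segment_entry_count):
    """Return (segment number, position within segment) for seq_id."""
    return divmod(seq_id, segment_entry_count)


seq_id = 20000  # hypothetical sequence ID
print(locate(seq_id, OLD_ENTRY_COUNT))  # (1, 3616)
print(locate(seq_id, NEW_ENTRY_COUNT))  # (9, 1568)
```
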
-
Have been able to reproduce it in versions 3.8.19, 3.8.18 and 3.8.17. To reproduce:
Setting segment_entry_count for vhost 'X' with 0 queues to '2048'
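For anyone who wants to check which value their nodes report, a rough Python sketch for pulling it out of the logs could look like the following. The pattern is built only from the single log line quoted above, so the exact wording in other versions is an assumption, and the helper name `segment_entry_counts` is hypothetical.

```python
import re

# Pattern derived from the one log line quoted above; other RabbitMQ
# versions may word this message differently (assumption).
PATTERN = re.compile(
    r"Setting segment_entry_count for vhost '(?P<vhost>[^']*)' "
    r"with (?P<queues>\d+) queues to '(?P<count>\d+)'"
)


def segment_entry_counts(log_lines):
    """Map each vhost name to the segment entry count logged for it."""
    counts = {}
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group("vhost")] = int(match.group("count"))
    return counts


sample = ["Setting segment_entry_count for vhost 'X' with 0 queues to '2048'"]
print(segment_entry_counts(sample))  # {'X': 2048}
```

Comparing the reported value against the one the index files were originally written with is the point of the check; the sketch only extracts what the node logged.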
-
We could reproduce it, and so far it seems that #3041 is the root cause. Reverting it seems to reliably make the issue go away.
-
@annieblomgren can you please try this alpha build? https://github.com/rabbitmq/rabbitmq-server-binaries-dev/releases/tag/v3.8.21-alpha.2