-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Element-R | Resetting key backup takes a long time; and blocks the whole application #26892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Is this not a duplicate of #26873? |
Could be one of the cause. This one is specifically after a backup reset. |
I actually meant #26783. But the issue is "Message sending is slow when there a lot of keys to back up", right? We should combine the two issues, imho. |
no, I misunderstood. The issue here is that the actual transaction to reset the key backup takes a very long time. (And if you restart during that time, everything is messed up) |
Plan:
|
Previous plan doesn't work without a migration because we can't efficiently query for records that don't have a matching Next idea: assign each backup version an order number, so each time we see a new backup version from the server, we set a In the backup loop we query inbound_group_sessions where The above would require a migration. Since we don't want to combine this with a migration to reduce value size, we can avoid a further migration by preparing now. Prep means: add the The logic within the backup loop looks like::
[Updated to use backed_up_to:-1 instead of a separate property] |
Some remarks on
What happens in situations were new backups version are created (without deleting the previous ones) without the client having beeing on to be aware of them. Then we start to delete them. Something like
The client is aware of version "1" and give it order 0 Now version "4" is deleted, making the old "3" as current. I guess it still works? but can there be some edge cases due to that? |
That's fine. The most recent backup is then version "3", and we will try to upload all of our keys to it. Of course, version "3" might already have all of our keys, so that would be somewhat wasted effort. But it'll sort itself out in the end. |
Not blocking Element R release because if we do #26930 we will be on-par with legacy crypto. (But we should still do this later.) |
FTR, with the new db format it took 3mn to reset all my keys (a bit less than 300k keys) instead of several 10s of minutes previously. |
To speed up key backup resets:
To remove legacy code and data:
|
I just had a conversation with @poljar and @richvdh about how we can implement this. Here are my notes from our conversation:
Thinking through the case when this has never happened before:
|
Had a further conversation with @richvdh today where we realised we can simplify this a bit, but still make use of the No need for backup_versions mappingWe realised we didn't need the
So instead of storing a backup order in No need to compare ordersPreviously, we were concerned about receiving an acknowledgement of a backed-up session for an old backup version, when actually a new backup was already in progress. Today, we reviewed the code in Thus, when we receive a backup ack we can safely assume it is for the latest version, and store that version next to the session so we know it has been backed up. Also, when looking for sessions to back up we can also just look for rows whose We also considered the case where the user deletes an new backup and we start re-using an old one. In this case, the order comparison we suggested was working against us, because we would refuse to back up a session to an "older" backup than the current one, but in the case where the new one has been deleted, we actually want to back up to the older one. So again, the correct comparison is Pre-existing backupIf we are looking for sessions to back up and we encounter one with In our previous idea we decided to treat Instead, we can simply track whether we are currently still on the first backup, or not. So, before Note: when another client resets, deletes or adds a backup, we will call I will describe the code we will write in the next comment. |
CodeDB migrationCreate a store called inbound_group_sessions_for_backupCheck inside the Search for sessions whose If Process and limit the list as before, and return it. mark_inbound_group_sessions_as_backed_upTo mark a session as backed up, remove its Add a comment to this code explaining that if we receive a call to this function with an old version then it can't be a race because of the reset_backup_stateInsert an entry into the |
Moving back to qualification following #29192 |
Step to reproduce
=> Starting now if you type a message, it won't be sent after several minutes depending on the number of room keys in database
What is happening
See here #26783 (comment)
Basically reseting backup will have to do a big write transaction that will lock the
inbound_group_sessions2
object store for several minutes.It will also lock all the database due to the rust
save_changes_lock
that is taken when saving changes after a sync=> Sync will be blocked, sending will be blocked
ANOTHER PROBLEM: If you close the tab or refresh the page, the transaction won't be retried so the keys won't be marked to be uploaded to the new backup
Difference with legacy
The
inbound_group_sessions2
rust object store is bigger (encrypted byte array of lenght ~3700) than the legacy one (json with pickled key of around (string of lenght ~700).So the rust db gets slower faster
The text was updated successfully, but these errors were encountered: