-
Notifications
You must be signed in to change notification settings - Fork 397
MSC1719: olm session unwedging #1719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Yep, this looks like a good summary. Thanks! |
Actually the kicker here is that we don't know what session an incoming olm message was intended for if we can't decrypt it, so we can't mark that session as 'replaced'. Thinking about how else we could do it. |
An example implementation of this is https://github.com/matrix-org/matrix-js-sdk/pull/780/files |
should assume that the olm session has become corrupted and create a new olm | ||
session to replace it. It should then send a dummy message, using that | ||
session, to the other party in order to inform them of the new session. To | ||
send a dummy message, clients may send an event with type `m.dummy`, and with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can m.dummy
only be sent in the context of unwedging? If not, how should the client handle it? By ignoring the event the client opens itself up to problems similar to https://github.com/vector-im/riot-web/issues/3261
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably doesn't necessarily need to be in the context of unwedging, but I can't really think of any other reason a client would send a dummy message, and I think we could just define m.dummy
as "Ignore this message". This is a to_device message, and won't be displayed in any way to the user, so I don't think it will introduce any problems in the timeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I thought this was a timeline event. Would be good to clarify that it is a to_device event (if it isn't already - I'm willing to believe I just missed that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's kind of hidden in the fact that the message is olm-encrypted. I guess that technically, it could be a timeline event, if you used olm for your room encryption rather than megolm, but ... let's just say that people won't do that. ;)
@mscbot fcp merge |
Team member @uhoreg has proposed to merge this. The next step is review by the rest of the tagged people: No concerns currently listed. Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
## Proposal | ||
|
||
When a device receives an olm-encrypted message that it cannot decrypt, it | ||
should assume that the olm session has become corrupted and create a new olm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this is already working in practice - but sometimes I receive undecryptable messages that I can decrypt after a little why. Possibly this MSC is the reason it does eventually become decrypted? But I was always chalking it up to server lag. If the latter is the case, shouldn't we wait a little bit before attempting to start another session? Or does rate limiting make this all ok anyways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With megolm, if you receive a message that you can't decrypt, then it could be that the key will come in later, and then you'll be able to decrypt it. With olm, if you receive a message that you can't decrypt, then there's no way to recover, and waiting for a while won't help.
send a dummy message, clients may send an event with type `m.dummy`, and with | ||
empty contents. | ||
|
||
In order to avoid creating too many extra sessions, a client should rate-limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a client DoS another client by ignoring this rate limit? Does the server protect against this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A client could drain another client's one-time keys, preventing others from starting sessions. There isn't really a good way of fixing that. Note that this MSC doesn't change what a client is able to do, so any DoS that one client can perform with this MSC could also be done without the MSC.
Co-Authored-By: Andrew Morgan <[email protected]>
LGTM @mscbot reviewed |
|
||
An attacker could use this to create a new session on a device that they are | ||
able to read. However, this would require the attacker to have compromised the | ||
device's keys. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know much about the crypto here, but this feels like it could weaken security? If an attacker has compromised the device keys can they restart a session easily today? Do we want to warn the user this has happened?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, clients would use the session with the smallest session ID, so an attacker could still create a new session to be used. They'd might just need to try a few times before they got a session ID that was small enough.
I should also note that Signal basically does the same thing as this proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signal's documentation for this is https://signal.org/docs/specifications/sesame/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Other people are doing it it must be fine (TM)" :p
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally lgtm. I've got a couple of questions.
and for sessions that have never received a message, it should use the creation | ||
time of the session. The spec will be changed to read: | ||
|
||
> If a client has multiple sessions established with another device, it should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this sounds like it fixes element-hq/element-web#2330, and maybe element-hq/element-web#4011 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 4011 was an implementation issue that was fixed by matrix-org/matrix-js-sdk#857. But this does fix 2330 -- I had been waiting for more testing before closing it.
Co-Authored-By: Richard van der Hoff <[email protected]>
🔔 This is now entering its final comment period, as per the review above. 🔔 |
As per [MSC1719](#1719) No known alterations have been made to the proposal. Implementation proof: matrix-org/matrix-js-sdk#780
Spec PR for when this leaves FCP: #2059 (assuming no last minute comments) |
The final comment period, with a disposition to merge, as per the review above, is now complete. |
merged 🎉 |
Rendered