Skip to content

MSC1719: olm session unwedging #1719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 4, 2019
Merged

Conversation

uhoreg
Copy link
Member

@uhoreg uhoreg commented Nov 14, 2018

@dbkr
Copy link
Member

dbkr commented Nov 14, 2018

Yep, this looks like a good summary. Thanks!

@dbkr
Copy link
Member

dbkr commented Nov 14, 2018

Actually the kicker here is that we don't know what session an incoming olm message was intended for if we can't decrypt it, so we can't mark that session as 'replaced'. Thinking about how else we could do it.

@uhoreg
Copy link
Member Author

uhoreg commented Nov 14, 2018

An example implementation of this is https://github.com/matrix-org/matrix-js-sdk/pull/780/files

@uhoreg uhoreg added e2e proposal A matrix spec change proposal T-Core labels Nov 14, 2018
@uhoreg uhoreg changed the title MSC: olm session unwedging MSC1719: olm session unwedging Nov 15, 2018
should assume that the olm session has become corrupted and create a new olm
session to replace it. It should then send a dummy message, using that
session, to the other party in order to inform them of the new session. To
send a dummy message, clients may send an event with type `m.dummy`, and with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can m.dummy only be sent in the context of unwedging? If not, how should the client handle it? By ignoring the event the client opens itself up to problems similar to https://github.com/vector-im/riot-web/issues/3261

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably doesn't necessarily need to be in the context of unwedging, but I can't really think of any other reason a client would send a dummy message, and I think we could just define m.dummy as "Ignore this message". This is a to_device message, and won't be displayed in any way to the user, so I don't think it will introduce any problems in the timeline.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought this was a timeline event. Would be good to clarify that it is a to_device event (if it isn't already - I'm willing to believe I just missed that).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of hidden in the fact that the message is olm-encrypted. I guess that technically, it could be a timeline event, if you used olm for your room encryption rather than megolm, but ... let's just say that people won't do that. ;)

@uhoreg
Copy link
Member Author

uhoreg commented May 22, 2019

@mscbot fcp merge

@mscbot
Copy link
Collaborator

mscbot commented May 22, 2019

Team member @uhoreg has proposed to merge this. The next step is review by the rest of the tagged people:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@mscbot mscbot added proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. disposition-merge labels May 22, 2019
## Proposal

When a device receives an olm-encrypted message that it cannot decrypt, it
should assume that the olm session has become corrupted and create a new olm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this is already working in practice - but sometimes I receive undecryptable messages that I can decrypt after a little why. Possibly this MSC is the reason it does eventually become decrypted? But I was always chalking it up to server lag. If the latter is the case, shouldn't we wait a little bit before attempting to start another session? Or does rate limiting make this all ok anyways?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With megolm, if you receive a message that you can't decrypt, then it could be that the key will come in later, and then you'll be able to decrypt it. With olm, if you receive a message that you can't decrypt, then there's no way to recover, and waiting for a while won't help.

send a dummy message, clients may send an event with type `m.dummy`, and with
empty contents.

In order to avoid creating too many extra sessions, a client should rate-limit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a client DoS another client by ignoring this rate limit? Does the server protect against this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A client could drain another client's one-time keys, preventing others from starting sessions. There isn't really a good way of fixing that. Note that this MSC doesn't change what a client is able to do, so any DoS that one client can perform with this MSC could also be done without the MSC.

Co-Authored-By: Andrew Morgan <[email protected]>
@anoadragon453
Copy link
Member

LGTM

@mscbot reviewed


An attacker could use this to create a new session on a device that they are
able to read. However, this would require the attacker to have compromised the
device's keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about the crypto here, but this feels like it could weaken security? If an attacker has compromised the device keys can they restart a session easily today? Do we want to warn the user this has happened?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, clients would use the session with the smallest session ID, so an attacker could still create a new session to be used. They'd might just need to try a few times before they got a session ID that was small enough.

I should also note that Signal basically does the same thing as this proposal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signal's documentation for this is https://signal.org/docs/specifications/sesame/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I guess

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Other people are doing it it must be fine (TM)" :p

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally lgtm. I've got a couple of questions.

and for sessions that have never received a message, it should use the creation
time of the session. The spec will be changed to read:

> If a client has multiple sessions established with another device, it should
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sounds like it fixes element-hq/element-web#2330, and maybe element-hq/element-web#4011 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 4011 was an implementation issue that was fixed by matrix-org/matrix-js-sdk#857. But this does fix 2330 -- I had been waiting for more testing before closing it.

uhoreg and others added 2 commits May 29, 2019 15:01
Co-Authored-By: Richard van der Hoff <[email protected]>
@mscbot mscbot added final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. and removed proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. labels May 30, 2019
@mscbot
Copy link
Collaborator

mscbot commented May 30, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@turt2live turt2live self-assigned this May 30, 2019
turt2live added a commit that referenced this pull request May 30, 2019
As per [MSC1719](#1719)

No known alterations have been made to the proposal.

Implementation proof: matrix-org/matrix-js-sdk#780
@turt2live turt2live mentioned this pull request May 30, 2019
@turt2live turt2live added spec-pr-in-review A proposal which has been PR'd against the spec and is in review and removed proposal-in-review labels May 30, 2019
@turt2live
Copy link
Member

Spec PR for when this leaves FCP: #2059 (assuming no last minute comments)

@mscbot
Copy link
Collaborator

mscbot commented Jun 4, 2019

The final comment period, with a disposition to merge, as per the review above, is now complete.

@mscbot mscbot removed the final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. label Jun 4, 2019
@turt2live turt2live merged commit b92b147 into matrix-org:master Jun 4, 2019
@turt2live
Copy link
Member

merged 🎉

@turt2live turt2live added merged A proposal whose PR has merged into the spec! and removed finished-final-comment-period spec-pr-in-review A proposal which has been PR'd against the spec and is in review labels Jun 4, 2019
bmarty added a commit to element-hq/element-android that referenced this pull request Apr 20, 2020
bmarty added a commit to element-hq/element-android that referenced this pull request Apr 20, 2020
@turt2live turt2live added the kind:core MSC which is critical to the protocol's success label Apr 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge e2e kind:core MSC which is critical to the protocol's success merged A proposal whose PR has merged into the spec! proposal A matrix spec change proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants