Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sessions are not encrypted and leak user_id #603

Open
brassy-endomorph opened this issue Sep 20, 2024 · 7 comments
Open

Sessions are not encrypted and leak user_id #603

brassy-endomorph opened this issue Sep 20, 2024 · 7 comments
Assignees

Comments

@brassy-endomorph
Copy link
Collaborator

Is your feature request related to a problem? Please describe.

Flask sessions are not encrypted, only signed/HMAC'd. This leaks user counts.

Describe the solution you'd like

Ideally both, but at least one of:

  • encrypt cookies
  • use UUIDs for user IDs
@rmlibre
Copy link
Contributor

rmlibre commented Sep 29, 2024

UUIDs are inferior to random strings that can be obtained from secrets.token_bytes. Additionally, since at least 2005, implementations have been following the RFC-4122 UUID standard, which explicitly states in §6

Do not assume that UUIDs are hard to guess; they should not be used
as security capabilities (identifiers whose mere possession grants
access), for example. A predictable random number source will
exacerbate the situation.

On top of that, even if the implementation uses a CSPRNG, UUID4 is limited to 122-bits, wasting entropy to metadata bits, and wastes space with its hex and hyphenated encoding.


Related to #528

@brassy-endomorph
Copy link
Collaborator Author

Are we trying to have "not at all guessable" or "not countable" because the latter is much more flexible, and if we use alternate IDs for sessions, we can have them be whatever high entropy value we want.

@rmlibre
Copy link
Contributor

rmlibre commented Oct 4, 2024

I'd like to solve this problem with a better designed session cookie. Namely, an encrypted session cookie which contains the current user_id and a session_token used for validation. This will allow us to keep the current user_id, but hide it from the user, and provide session validation with a new session_token column value. This approach should be simpler, since we won't have to change how the user_ids are created. And this will be more flexible, since we'll likely want to encrypt other things in the session cookie in future.

steps

These are the steps I'm preparing to follow to get this done.

  1. small PRs to in-parallel incorporate improvements from [DRAFT] feat: begin modernization of symmetric crypto [#268][#289][#357] #411:
    a. better KDF
    b. better encryption
    c. better domain separation
    d. better canonicalization
    e. will not change any existing functionality, only offer better options for new features
  2. small PR to add session token to User model
  3. final PR to swap user_id lookups in flask sessions which will:
    a. encrypt / decrypt cookie
    b(i). validate session token in cookie to db token
    b(ii). currently drafting up ideas for privacy preserving time-based token validation
    b(iii). also considering time-agnostic tokens that are replaced every login / deleted on logout
    c(i). insert / extract necessary fields using a canonical byte-wise encoding
    c(ii). start with only encrypting user_id & session_token for simplicity
  4. review to assess next steps

example

The implementation will look something like this in practice:

login

session["encrypted_cookie"] = encrypt_cookie(user_id, session_token, <other-values?>)
user.session_token = session_token
db.session.commit()

endpoints

user_id, session_token, <other-values?> = decrypt_cookie(session)
if (user := db.session.get(User, user_id)) is None:
    flash("🫥 User not found. Please log in again.")
    return redirect(url_for("login"))
if not bytes_are_equal(session_token, user.session_token):
    flash("⛔️ Invalid session token. Please log in again.")
    return redirect(url_for("login"))

Related to (#411, #528)

@brassy-endomorph
Copy link
Collaborator Author

small PRs to in-parallel incorporate improvements from

Just to be clear, you mean closing that one and adding one change at a time?

small PR to add session token to User model

If we want to support multiple distinct sessions that can be separately invalidated, we want a UserSession object with its own table.

final PR to swap user_id lookups in flask sessions which will:

This is a tall order for a single PR and each of these should probably be made their own ticket so implementation can be discussed separately.

For example, handling cookies like this:

session["encrypted_cookie"] = encrypt_cookie(user_id, session_token, <other-values?>)

Is not idiomatic. We can override the default session interface so this happens transparently throughout the app.

c(i). insert / extract necessary fields using a canonical byte-wise encoding

I'm not sure what this even means or is in reference to. Can you elaborate (or just make it its own ticket and dump knowledge there).

rmlibre added a commit to rmlibre/hushline that referenced this issue Oct 7, 2024
Git patch files can be downloaded and applied on any HEAD:
``git apply volumes/user_id_session_cookie_change_locations.patch``

This file was produced by saving the relevant diff:
``git diff > volumes/user_id_session_cookie_change_locations.patch``
rmlibre added a commit to rmlibre/hushline that referenced this issue Oct 7, 2024
@rmlibre
Copy link
Contributor

rmlibre commented Oct 7, 2024

small PRs to in-parallel incorporate improvements from

Just to be clear, you mean closing that one and adding one change at a time?

I'll be using #411 as a reference guide. So eventually it will be closed, once it's no longer needed.


small PR to add session token to User model

If we want to support multiple distinct sessions that can be separately invalidated, we want a UserSession object with its own table.

Yeah, that's a solid idea. In this case, would a log out invalidate all sessions by the same user, or just the current session?


final PR to swap user_id lookups in flask sessions which will:

This is a tall order for a single PR and each of these should probably be made their own ticket so implementation can be discussed separately.

For example, handling cookies like this:

session["encrypted_cookie"] = encrypt_cookie(user_id, session_token, <other-values?>)

Is not idiomatic. We can override the default session interface so this happens transparently throughout the app.

I really like the idea of overriding the default session object. The two approaches address different notions of a small PR. Overriding the session object may lead to fewer lines of code and may leave endpoints untouched, but it will affect the behavior of every session object interaction. Adding an encrypted field that's handled through an abstracting function leaves the existing behavior of all other fields untouched, but will need an implementation swap at the endpoints.

I'm open to either approach. Though overriding the default session object comes with higher risk, since we'd have to be fully acquainted with how flask handles the object to make sure we aren't failing to implement a required interface, or consider some subtle security implication. As a security auditor, it would be easier to reason about an encrypted field, than a reimplementation of a core object in a third party library.

So I could see clearly what needed to be changed, I found the locations and uploaded it as a patch file.

https://github.com/rmlibre/hushline/blob/e588c15dd8c1603b3fe00deb617ede260014478e/volumes/user_id_session_cookie_change_locations.patch

There's 41 locations in endpoints which would need a swap of implementation if an abstracting function were used. That means the swap would lead to a PR with maybe 80 lines of code changed.


c(i). insert / extract necessary fields using a canonical byte-wise encoding

I'm not sure what this even means or is in reference to. Can you elaborate (or just make it its own ticket and dump knowledge there).

There's a section in #411's summary titled Input Canonicalization with links. This is a way to encode distinct bytes-type values such that they can be encrypted and later decoded unambiguously. Though, I'm also considering json.dumps(...).encode(). Though it's less space efficient, it will be simpler to handle retrieving values by name when decoded back into a dict after being decrypted.


There are two routes which inject the user_id in the URL. This behavior should be changed so that it no longer injects this data. This can be handled in a separate PR as well.

https://github.com/rmlibre/hushline/blob/e588c15dd8c1603b3fe00deb617ede260014478e/hushline/templates/settings/admin.html#L40-L53

https://github.com/rmlibre/hushline/blob/e588c15dd8c1603b3fe00deb617ede260014478e/hushline/admin.py#L13-L23

https://github.com/rmlibre/hushline/blob/e588c15dd8c1603b3fe00deb617ede260014478e/hushline/admin.py#L25-L35

@brassy-endomorph
Copy link
Collaborator Author

Yeah, that's a solid idea. In this case, would a log out invalidate all sessions by the same user, or just the current session?

So maybe the MVP is just:

  • Table for sessions
  • Logging out kills only current session

Full list of things related to this that I think are standard would be:

  • Checkbox that on password change invalidates all other sessions
  • Sessions page in settings that shows:
    • list of sessions with date created, last active, device type (Linux/Android/Mac)
    • individual sessions can be deleted
    • button to nuke all sessions
  • a reaper process that hourly deletes sessions that are older than a configured lifetime

Overriding the session object may lead to fewer lines of code and may leave endpoints untouched, but it will affect the behavior of every session object interaction.

It doesn't. En/decryption happens outside the endpoints so in an endpoint we use it like this:

from flask import session

@app.route("/foo")
def foo():
    if val := session.get("wat"):
        return val
    session["wat"] = 'hello world'
    return 'wat: new'

This is complete invisible and we don't have to worry about manually encrypting the field or keeping tabs on what fields are or aren't encrypted. The sessions are HTTP-only, so there's no reason to leave any fields unencrypted.

Another option is to just set a session ID and pull the session itself from Redis (or PG or whatever, there's many options) so that the cookie is only the key to the ephemeral store. If we use Redis and set the lifetime for the sessions, we can avoid the need for a reaper process. Though I haven't done Flask-Login + Redis-sessions before, so I'm not totally sure that there's a super slick way to glue them together, but intuition says yes it should be quite easy.

Though overriding the default session object comes with higher risk

There are tons of examples and Flask endorses this. There are many examples of session interfaces from well-maintained Flask extensions. Relevant Flask docs: https://flask.palletsprojects.com/en/3.0.x/api/#flask.sessions.SessionInterface

it would be easier to reason about an encrypted field

If we do it simply (just add en/decryption to current object, no Redis), I'm pretty sure it's something like 10 significant lines of code for the entire thing.

But I'm less worried about auditability of the interface because many bugs come from people using interfaces wrong. Every dev has to remember to do the en/decryption on the relevant endpoints or they have to remember to assign fields to some imported LocalProxy like encrypted_session and not accidentally write to session. I suspect we're more likely to get bugs by moving further away from the standard interface than we are in implementing the interface itself.

There's 41 locations in endpoints which would need a swap of implementation if an abstracting function were used

I feel like this kinda proves my above point? That's a lot of place where changes have to be explicit instead of abstracted away from devs.


There are two routes which inject the user_id in the URL.

This comes back to a previous point which is do IDs used in the app (URLs, form fields, etc.) need to be "cryptographically secure totally random unguessable" or just "not an int you can easily count." I say the latter because all auth checks should be explicit every time in which case an int is as secure as a UUID (with the exception about leaking counts). This also helps prevent table bloat.

Also this is a fairly big/annoying change, so that alone should be limited to a single PR because it would need significant migration tests as well.

@brassy-endomorph
Copy link
Collaborator Author

Also just as a note, a lot of our auth logic is based around us doing things like this:

@app.route("/foo")
@authentication_required
def foo():
    user_id = session.get('user_id', None)
    user = db.session.get(User, user_id)
    if user is None:
        # abort/redirect logic
    return user.id

This has a double auth check: @authentication_required and if user is None.

If we use Flask-Login + encrypted sessions, we get this:

from flask_login import login_required, current_user

@app.route("/foo")
@login_required
def foo():
    return current_user.id

This isn't strictly related to this ticket in the sense that Flask-Login doesn't solve the issue, but we almost certainly want Flask-Login in which case we want to just modify the SessionInterface so that extensions pick up our secure session interface without us having to do extra work. The Flask-Login docs have an example about overriding it and how the global config is picked up by the extension: https://flask-login.readthedocs.io/en/latest/#disabling-session-cookie-for-apis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In-Progress On Hold
Development

No branches or pull requests

2 participants