Merge pull request #127 from lsst-sqre/tickets/DM-42331
Add modernized tox environments/linting
rufuspollock authored Jan 3, 2024
2 parents 90a1175 + 3b90ae6 commit bf397c7
Showing 62 changed files with 3,228 additions and 1,745 deletions.
32 changes: 32 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,32 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: check-merge-conflict
- id: check-toml
# FIXME: VCR is unhappy; address in test rewrite
# - id: check-yaml
# args: [--allow-multiple-documents]
- id: trailing-whitespace

# FIXME: introduce after initial cleanup; it's going to take a lot
# of work.
# - repo: https://github.com/astral-sh/ruff-pre-commit
# rev: v0.1.8
# hooks:
# - id: ruff
# args: [--fix, --exit-non-zero-on-fix]
# - id: ruff-format

# FIXME: replace with ruff, eventually
- repo: https://github.com/psf/black
rev: 23.12.1
hooks:
- id: black

- repo: https://github.com/adamchainz/blacken-docs
rev: 1.16.0
hooks:
- id: blacken-docs
additional_dependencies: [black==23.10.1]
args: [-l, '79', -t, py311]
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,4 +1,4 @@
Copyright 2020 Datopian (Viderum, Inc.)
Copyright 2020-2024 Datopian (Viderum, Inc.)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
2 changes: 1 addition & 1 deletion Makefile
@@ -83,7 +83,7 @@ $(SENTINELS):
mkdir $@

$(SENTINELS)/dist-setup: | $(SENTINELS)
$(PIP) install -U pip wheel twine
$(PIP) install -U pip wheel twine pre-commit
@touch $@

$(SENTINELS)/dist: $(SENTINELS)/dist-setup $(DIST_DIR)/$(PACKAGE_NAME)-$(VERSION).tar.gz $(DIST_DIR)/$(PACKAGE_NAME)-$(VERSION)-py3-none-any.whl | $(SENTINELS)
8 changes: 4 additions & 4 deletions README.md
@@ -20,11 +20,11 @@ storage backends:

In addition, Giftless implements a custom transfer mode called `multipart-basic`,
which is designed to take advantage of many vendors' multipart upload
capabilities. It requires a specialized Git LFS client to use, and is currently
not supported by standard Git LFS.
capabilities. It requires a specialized Git LFS client to use, and is currently
not supported by standard Git LFS.

See the [giftless-client](https://github.com/datopian/giftless-client) project
for a compatible Python Git LFS client.
for a compatible Python Git LFS client.

Additional transfer modes and storage backends could easily be added and
configured.
@@ -34,7 +34,7 @@ configured.
Documentation
-------------
* [Installation Guide](https://giftless.datopian.com/en/latest/installation.html)
* [Getting Started](https://giftless.datopian.com/en/latest/quickstart.html)
* [Getting Started](https://giftless.datopian.com/en/latest/quickstart.html)
* [Full Documentation](https://giftless.datopian.com/en/latest/)
* [Developer Guide](https://giftless.datopian.com/en/latest/development.html)

7 changes: 7 additions & 0 deletions changelog.d/_template.md.jinja
@@ -0,0 +1,7 @@
<!-- Delete the sections that don't apply -->
{%- for cat in config.categories %}

### {{ cat }}

-
{%- endfor %}
133 changes: 69 additions & 64 deletions docs/source/auth-providers.md
@@ -2,18 +2,18 @@ Authentication and Authorization Providers
==========================================

## Overview
Authentication and authorization in Giftless are pluggable and can easily be customized.
While Giftless typically bundles together code that handles both authentication and to
Authentication and authorization in Giftless are pluggable and can easily be customized.
While Giftless typically bundles together code that handles both authentication and to
some degree authorization, the two concepts should be understood separately first in order
to understand how they are handled by Giftless.
to understand how they are handled by Giftless.

* *Authentication* (sometimes abbreviated here and in the code as `authn`) relates to
* *Authentication* (sometimes abbreviated here and in the code as `authn`) relates to
validating the identity of the entity (person or machine) sending a request to Giftless
* *Authorization* (sometimes abbreviated as `authz`) relates to deciding, once an
identity has been established, whether the requesting party is permitted to perform
the requested operation
* *Authorization* (sometimes abbreviated as `authz`) relates to deciding, once an
identity has been established, whether the requesting party is permitted to perform
the requested operation

``` note:: In this guide and elsewhere we may refer to *auth* as a way of referring to
``` note:: In this guide and elsewhere we may refer to *auth* as a way of referring to
both authentication and authorization in general, or where distinction between the two
concepts is not important.
```
@@ -24,54 +24,54 @@ Giftless provides the following authentication and authorization modules by defa
* `giftless.auth.jwt:JWTAuthenticator` - uses [JWT tokens](https://jwt.io/) to both identify
the user and grant permissions based on scopes embedded in the token payload.
* `giftless.auth.allow_anon:read_only` - grants read-only permissions on everything to every
request; Typically, this is only useful in testing environments or in very limited
request; Typically, this is only useful in testing environments or in very limited
deployments.
* `giftless.auth.allow_anon:read_write` - grants full permissions on everything to every
request; Typically, this is only useful in testing environments or in very limited
request; Typically, this is only useful in testing environments or in very limited
deployments.

## Configuring Authenticators
Giftless allows you to specify one or more auth modules via the `AUTH_PROVIDERS` configuration
key. This accepts a *list* of one or more auth modules. When a request comes in, auth modules will
be invoked by order, one by one, until an identity is established.
Giftless allows you to specify one or more auth modules via the `AUTH_PROVIDERS` configuration
key. This accepts a *list* of one or more auth modules. When a request comes in, auth modules will
be invoked by order, one by one, until an identity is established.

For example:
```yaml
AUTH_PROVIDERS:
- factory: giftless.auth.jwt:factory
options:
options:
algorithm: HS256
private_key: s3cret,don'ttellany0ne
- giftless.auth.allow_anon:read_only
```
The config block above defines 2 auth providers: first, the `JWT` auth provider will be
The config block above defines 2 auth providers: first, the `JWT` auth provider will be
tried. If it manages to produce an identity (i.e. the request contains an acceptable JWT
token), it will be used. If the request does not contain a `JWT` token, Giftless will fall
back to the next provider - in this case, the `allow_anon:read_only` provider which will
allow read-only access to anyone.
allow read-only access to anyone.

This allows servers to be set up to accept different authorization paradigms.

You'll notice that each item in the `AUTH_PROVIDERS` list can be either an object with
`factory` and `options` keys - in which case Giftless will load the auth module by
calling the `factory` Python callable (in the example above, the `factory` function in
the `giftless.auth.jwt` Python module); Or, in simpler cases, it can be just a string
`factory` and `options` keys - in which case Giftless will load the auth module by
calling the `factory` Python callable (in the example above, the `factory` function in
the `giftless.auth.jwt` Python module); Or, in simpler cases, it can be just a string
(as in the case of our 2nd provider), which will be treated as a `factory` value with
no options.
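
The `module:callable` reference used by `factory` can be illustrated with a short sketch. This is not Giftless's own loader: `resolve_callable` and `call_factory` are hypothetical names, and the sketch only assumes the string format and the `factory` / `options` keys described above.

```python
import importlib
from typing import Any, Callable


def resolve_callable(spec: str) -> Callable[..., Any]:
    """Import the object named by a 'package.module:attribute' string."""
    module_name, _, attr_name = spec.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"expected 'module:callable', got {spec!r}")
    return getattr(importlib.import_module(module_name), attr_name)


def call_factory(entry: dict) -> Any:
    """Call the factory named in an object-style entry with its options."""
    factory = resolve_callable(entry["factory"])
    # A missing `options` key is treated as "no options" (an assumption).
    return factory(**entry.get("options", {}))
```

With the configuration shown earlier, the first entry would resolve `giftless.auth.jwt:factory` and call it with `algorithm` and `private_key` as keyword arguments.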

Read below for the `options` possible for specific auth modules.
Read below for the `options` possible for specific auth modules.

## JWT Authenticator
This authenticator authenticates users by accepting a well-formed [JWT token](https://jwt.io/)
in the Authorization header as a Bearer type token, or as the value of the `?jwt=` query
parameter. Tokens must be signed by the right key, and also match in terms of audience,
in the Authorization header as a Bearer type token, or as the value of the `?jwt=` query
parameter. Tokens must be signed by the right key, and also match in terms of audience,
issuer and key ID if configured, and of course have valid expiry / not before times.

### Piggybacking on `Basic` HTTP auth
The JWT authenticator will also accept JWT tokens as the password for the `_jwt` user in `Basic` HTTP
`Authorization` header payload. This is designed to allow easier integration with clients that only support
Basic HTTP authentication.
Basic HTTP authentication.

You can disable this functionality or change the expected username using the `basic_auth_user` configuration option.
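
As an illustration of the client side, the following sketch builds such a header using only the Python standard library. The helper name `basic_auth_header` is hypothetical; the default username `_jwt` is the one documented above.

```python
import base64


def basic_auth_header(jwt_token: str, username: str = "_jwt") -> dict:
    """Build an Authorization header carrying a JWT as the Basic password."""
    credentials = f"{username}:{jwt_token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

If the server is configured with a different `basic_auth_user`, pass that value as `username`.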

@@ -80,41 +80,41 @@ The following options are available for the `jwt` auth module:

* `algorithm` (`str`): JWT algorithm to use, e.g. `HS256` (default) or `RS256`. Must match the algorithm
used by your token provider
* `public_key` (`str`): Public key string, used to verify tokens signed with any asymmetric algorithm (i.e. all
* `public_key` (`str`): Public key string, used to verify tokens signed with any asymmetric algorithm (i.e. all
algorithms except `HS*`); Optional, not needed if a symmetric algorithm is in use.
* `public_key_file` (`str`): Path to file containing the public key. Specify as an alternative to `public_key`.
* `private_key` (`str`): Private key string, used to verify tokens signed with a symmetric algorithm (i.e. `HS*`);
Optional, not needed if an asymmetric algorithm is in use.
* `private_key_file` (`str`): Path to file containing the private key. Specify as an alternative to `private_key`.
* `public_key_file` (`str`): Path to file containing the public key. Specify as an alternative to `public_key`.
* `private_key` (`str`): Private key string, used to verify tokens signed with a symmetric algorithm (i.e. `HS*`);
Optional, not needed if an asymmetric algorithm is in use.
* `private_key_file` (`str`): Path to file containing the private key. Specify as an alternative to `private_key`.
* `leeway` (`int`): Key expiry time leeway in seconds (default is 60); This allows for a small clock time skew
between the key provider and Giftless server
* `key_id` (`str`): Optional key ID string. If provided, only keys with this ID will be accepted.
* `key_id` (`str`): Optional key ID string. If provided, only keys with this ID will be accepted.
* `basic_auth_user` (`str`): Optional HTTP Basic authentication username to look for when piggybacking on Basic
authentication. Default is `_jwt`. Can be set to `None` to disable inspecting `Basic` auth headers.
authentication. Default is `_jwt`. Can be set to `None` to disable inspecting `Basic` auth headers.

#### Options only used when module used for generating JWT tokens
The following options are currently only in use when the module is used for generating tokens for
The following options are currently only in use when the module is used for generating tokens for
self-signed requests (i.e. not as an `AUTH_PROVIDER`, but as a `PRE_AUTHORIZED_ACTION_PROVIDER`):

* `default_lifetime` (`int`): lifetime of token in seconds
* `default_lifetime` (`int`): lifetime of token in seconds
* `issuer` (`str`): token issuer (optional)
* `audience` (`str`): token audience (optional)

### JWT Authentication Flow
A typical flow for JWT is:

0. There is an external *trusted* system that can generate and sign JWT tokens and
0. There is an external *trusted* system that can generate and sign JWT tokens and
Giftless is configured to verify and accept tokens signed by this system
1. User is logged in to this external system
2. A JWT token is generated and signed by this system, granting permission to specific
2. A JWT token is generated and signed by this system, granting permission to specific
scopes applicable to Giftless
3. The user sends the JWT token along with any request to Giftless, using either
the `Authorization: Bearer ...` header or the `?jwt=...` query parameter
4. Giftless validates and decodes the token, and proceeds to grant permissions
based on the `scopes` claim embedded in the token.
4. Giftless validates and decodes the token, and proceeds to grant permissions
based on the `scopes` claim embedded in the token.

To clarify, it is up to the 3rd party identity / authorization provider to decide,
based on the known user identity, what scopes to grant.
To clarify, it is up to the 3rd party identity / authorization provider to decide,
based on the known user identity, what scopes to grant.
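
Step 2 of the flow can be sketched for the symmetric `HS256` case using only the standard library. This is an illustration of how a trusted system might sign a token, not code from Giftless; `issue_token` is a hypothetical helper, and a real deployment would more likely use a maintained JWT library. The value passed as `scopes` is copied into the `scopes` claim unchanged, so its type and format are left to the caller.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def issue_token(
    secret: str, scopes: Any, lifetime: int = 300, now: Optional[int] = None
) -> str:
    """Return an HS256-signed JWT carrying `scopes` and an expiry time."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"exp": issued_at + lifetime, "scopes": scopes}
    # The signing input is base64url(header) + "." + base64url(claims).
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

The resulting string is what the client would then send in step 3, for example as `Authorization: Bearer <token>`. The secret must match the `private_key` option configured for the `jwt` auth module.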

### Scopes
Beyond authentication, JWT tokens may also include authorization payload
@@ -131,15 +131,15 @@ or:

Where:

* `{org}` is the organization of the target object
* `{repo}` is the repository of the target object. Omitting or replacing with `*`
* `{org}` is the organization of the target object
* `{repo}` is the repository of the target object. Omitting or replacing with `*`
designates we are granting access to all repositories in the organization
* `{oid}` is the Object ID. Omitting or replacing with `*` designates we are granting
* `{oid}` is the Object ID. Omitting or replacing with `*` designates we are granting
access to all objects in the repository
* `{subscope}` can be `metadata` or omitted entirely. If `metadata` is specified,
the scope does not grant access to actual files, but to metadata only - e.g. objects
* `{subscope}` can be `metadata` or omitted entirely. If `metadata` is specified,
the scope does not grant access to actual files, but to metadata only - e.g. objects
can be verified to exist but not downloaded.
* `{actions}` is a comma separated list of allowed actions. Actions can be `read`, `write`
* `{actions}` is a comma separated list of allowed actions. Actions can be `read`, `write`
or `verify`. If omitted or replaced with a `*`, all actions are permitted.

### Examples
@@ -193,27 +193,27 @@ servers.

## Understanding Authentication and Authorization Providers

This part is more abstract, and will help you understand how Giftless handles
authentication and authorization in general. If you want to create a custom auth
module, or better understand how provided auth modules work, read on.
This part is more abstract, and will help you understand how Giftless handles
authentication and authorization in general. If you want to create a custom auth
module, or better understand how provided auth modules work, read on.

Giftless' authentication and authorization module defines two key interfaces for handling
authentication and authorization:

### Authenticators
Authenticator classes are subclasses of `giftless.auth.Authenticator`. One or more
authenticators can be configured at runtime, and each authenticator can try to obtain a
valid user identity from a given HTTP request.
Authenticator classes are subclasses of `giftless.auth.Authenticator`. One or more
authenticators can be configured at runtime, and each authenticator can try to obtain a
valid user identity from a given HTTP request.

Once an identity has been established, an `Identity` (see below) object will be returned,
and it is the role of the Authenticator class to populate this object with information about
the user, such as their name and email, and potentially, information on granted permissions.
the user, such as their name and email, and potentially, information on granted permissions.

Multiple authenticators can be chained, so that if one authenticator cannot find a valid
Multiple authenticators can be chained, so that if one authenticator cannot find a valid
identity in the request, the next authenticator will be called. If no authenticator manages
to return a valid identity, by default a `401 Unauthorized` response will be returned for
to return a valid identity, by default a `401 Unauthorized` response will be returned for
any action, but this behavior can be modified via the `@Authentication.no_identity_handler`
decorator.
decorator.
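
The chaining behavior described here amounts to "first authenticator that yields an identity wins". A minimal sketch follows, assuming each authenticator can be modeled as a callable that takes the request and returns an identity or `None`; that call signature and the name `first_identity` are assumptions for illustration, not the actual `giftless.auth.Authenticator` interface.

```python
from typing import Any, Callable, Iterable, Optional

# Assumed shape of an authenticator: request in, identity (or None) out.
AuthenticatorFn = Callable[[Any], Optional[Any]]


def first_identity(
    request: Any, authenticators: Iterable[AuthenticatorFn]
) -> Optional[Any]:
    """Try authenticators in order and return the first identity found."""
    for authenticate in authenticators:
        identity = authenticate(request)
        if identity is not None:
            return identity
    # No authenticator recognized the request; the caller decides what to do
    # (by default, the text above says, a 401 response is returned).
    return None
```

Order matters: placing an anonymous fallback first would mean later authenticators are never consulted.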

### Identity
Very simply, an `Identity` object encapsulates information about the current user making the
@@ -225,27 +225,32 @@ class Identity:
id: Optional[str] = None
email: Optional[str] = None
def is_authorized(self, organization: str, repo: str, permission: Permission, oid: Optional[str] = None) -> bool:
"""Tell if user is authorized to perform an operation on an object / repo
"""
def is_authorized(
self,
organization: str,
repo: str,
permission: Permission,
oid: Optional[str] = None,
) -> bool:
"""Tell if user is authorized to perform an operation on an object / repo"""
pass
```

Most notably, the `is_authorized` method will be used to tell whether the user, represented by
Most notably, the `is_authorized` method will be used to tell whether the user, represented by
the Identity object, is authorized to perform an action (one of the `Permission` values specified
below) on a given entity.
below) on a given entity.

Authorizer classes may use the default built-in `DefaultIdentity`, or implement an `Identity`
subclass of their own.
Authorizer classes may use the default built-in `DefaultIdentity`, or implement an `Identity`
subclass of their own.
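
To make the `is_authorized` contract concrete, here is a self-contained sketch of a custom identity that grants read access to a single organization. It does not import Giftless: the `Permission` enum is re-declared locally with the same three values this guide lists, and `OrgReadOnlyIdentity` is a hypothetical class that only mirrors the method signature shown above rather than subclassing the real `Identity`.

```python
from enum import Enum
from typing import Optional


class Permission(Enum):
    # Local stand-in with the values documented in this guide.
    READ = "read"
    READ_META = "read-meta"
    WRITE = "write"


class OrgReadOnlyIdentity:
    """Identity that may read, but not write, within one organization."""

    def __init__(self, name: str, organization: str) -> None:
        self.name = name
        self.organization = organization

    def is_authorized(
        self,
        organization: str,
        repo: str,
        permission: Permission,
        oid: Optional[str] = None,
    ) -> bool:
        if organization != self.organization:
            return False
        # Any repo and object in the organization is readable; nothing is writable.
        return permission in {Permission.READ, Permission.READ_META}
```

In a real module the class would subclass `Identity` and be returned by a custom authenticator.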

#### Permissions
Giftless defines the following permissions on entities:

```python
class Permission(Enum):
READ = 'read'
READ_META = 'read-meta'
WRITE = 'write'
READ = "read"
READ_META = "read-meta"
WRITE = "write"
```

For example, if `Permission.WRITE` is granted on an object or a repository, the user will