Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for token prefix masking #20

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

eds-collabora
Copy link
Collaborator

@eds-collabora eds-collabora commented Nov 25, 2022

This builds upon PR #19 and adds support for Gitlab's new prefix masking feature, whereby token prefixes can be generically masked (e.g. you can mask all PATs in your logs, regardless of whether they are substituted from a variable or not).

There are basically two parts to this:

  • Parse the necessary Gitlab features to determine what token prefixes are needed.
  • Pass the required prefixes to the Masker

The rest of this PR (the parts that aren't included in #19) is testing. Note that some of the tests are taken directly from the upstream gitlab-runner (with a simple proc macro wrapper to convert them to Rust). This is to achieve maximum conformance with their implementation. Both this crate and their code are MIT licensed.

This is draft pending:

This brings in a dependency on the mask crate. The code to support
masking artifacts will come later. This only masks the logs passed to
Gitlab.
This is only a basic test that the masking is in fact being applied to
variables marked as masked. This is not a test of the functionality of
the masking crate, only of its integration into this crate.
Although we already automatically mask the trace sent to Gitlab, some
runners will produce additional log files or metadata, which must also
be masked to prevent secrets from being revealed in the artifact data.

This permits client crates to identify a file they wish to upload as
additionally requiring masking.

Note that the version bump in the masker crate is because we now need
to be able to clone the masker for each Uploader to use.
This is the other half of the capability signalling that Gitlab does.
Not only does a runner signal what features it is capable of supporting
when making a request, but the server also provides some information
back to the runner about what capabilities it provides.

There are currently three capabilities Gitlab can indicate it has
(this commit only adds parsing not support):

- Separating the trace into sections for easier viewing.
- Prefix masking (it's not totally clear why this is a Gitlab feature -
  but this is the mechanism by which the prefix masks are communicated to
  a runner willing to mask them).
- Failure reasons.

The end goal here is support for prefix masking, so that things like
PAT tokens are all automatically masked system-wide in logs.
If the features passed by the Gitlab server indicate that there are
token prefixes that should be masked, this adds those prefixes to the
Masker. This is fully transparent to users of this crate; the correct
thing will happen provided they correctly identify which artifacts
they believe should be masked. The trace will always be masked
correctly.
Allow token prefixes to be registered on the mock job. If they
are present, add them to the Gitlab features provided by the mock
with that job, as Gitlab itself does.
This is useful when testing whether the artifacts uploaded are as
expected.  In particular, we need this now to test whether masked
artifacts are actually being masked as we expect them to be.

Note that we are returning the raw binary of the zipfile, and we'll
need to use the zipfile crate to actually interpret this data. That
seems acceptable for now.
Using very common short names for tokens like 'token', when this is
also masked, can now have odd effects on other tests, because we may
unexpectedly mask out these words from log data. It's safer to use
slightly longer stand-ins. In particular, they conflict with Gitlab's
own tests of prefix matching.
This uses the actual source code of those tests, which is MIT licensed
(as is this crate), to be certain that we match the behaviour of the
upstream runner.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant