-
Notifications
You must be signed in to change notification settings - Fork 17
/
Copy pathtopics-security.Rmd
155 lines (107 loc) · 8.47 KB
/
topics-security.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
# Security {#security-chapter}
When developing a package that uses secrets (API keys, [OAuth](https://blog.r-hub.io/2021/01/25/oauth-2.0/) tokens) and produces them (OAuth tokens, sensitive data),
* You want the secrets to be usable by you, collaborators and CI services, without being readable by anyone else;
* You want tests and checks (e.g. vignette building) that use the secrets to be turned off in environments where secrets won't be available (CRAN, forks of your development repository).
Your general attitude should be to think about:
* what are my secrets (an API key, an OAuth2.0 access token and the refresh token, etc.) and where/how exactly are there used (in the query part of an URL? as a header? which header, Authentication or something else?) -- packages like httr or httr2 might abstract some of the complexity for you but you need to really know where secrets are used and could be leaked,
* what could go wrong (e.g. your token ending up being published),
* how to prevent that (save your unedited token outside of your package, make sure it is not printed in logs or present in package check artefacts),
* how to fix mistakes (how do you deactivate a token and how do you check no one used it in the meantime).
## Managing secrets securely
### Follow best practice when developing your package
This book is about testing but security starts with how you develop your package.
To better protect your users' secret,
* It might be best not to let users pass API keys as parameters. It's best to have them save them in `.Renviron` or e.g. using the [keyring package](https://github.com/r-lib/keyring). This way, API keys are not in scripts. The [opencage package](https://docs.ropensci.org/opencage/articles/opencage.html#authentication-1) might provide some inspiration.
* If the API you are working with lets you pass keys either in the request headers or query string, prefer to use request headers.
### Share secrets with continuous integration services
You need to share secrets with continuous integration services... for real requests only!
For tests using vcr, httptest, httptest2 or webfakes, you at most need a fake secret, e.g. "foobar" as API key -- except for recording cassettes and mock files, but that is something you do locally.
::: {.alert .alert-dismissible .alert-primary}
In GitHub repositories, when storing a new secret, do not save it with quotes.
I.e. if your secret is "blabla", the field should contain `blabla`, not `"blabla"` nor `'blabla'`.
```{r secret, fig.alt="Screenshot of the interface for adding secrets in a GitHub repository, showing how the secret is stored without any quote."}
knitr::include_graphics("secret.png")
```
:::
#### API keys
For API keys, you can use something like GitHub repo secrets if you use GitHub Actions.
Then for the secret to be accessible as environment variable from your workflow in GitHub Actions [as explained in gargle docs](https://gargle.r-lib.org/articles/articles/managing-tokens-securely.html#provide-environment-variable-to-other-services-1) you need to add a line like
```yaml
env:
PACKAGE_PASSWORD: ${{ secrets.PACKAGE_PASSWORD }}
```
#### More complex objects
If your secret is an OAuth token, you might be able to re-create it from pieces, where the pieces are strings you can store as repo secrets much like what you'd do for an API key.
E.g. if your secret is an OAuth token, the [actual secrets](https://blog.r-hub.io/2021/01/25/oauth-2.0/#what-are-your-oauth-20-secret-credentials) are the access token and refresh token.
```yaml
env:
ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
REFRESH_TOKEN: ${{ secrets.REFRESH_TOKEN }}
```
Therefore you could re-create it using e.g. the `credentials` argument of `httr::oauth2.0_token()`.
The re-creation using environment variables `Sys.getenv("ACCESS_TOKEN")` and `Sys.getenv("REFRESH_TOKEN")` would happen in a [testthat helper file](https://blog.r-hub.io/2020/11/18/testthat-utility-belt/).
### Secret files
For files, you will need to use encryption and to store a text-version of the encryption key/passwords as GitHub repo secret if you use GitHub Actions.
Read the documentation of the continuous integration service your are using to find out how secrets are protected and how you can use them in your builds.
See [gargle vignette about securely managing tokens](https://gargle.r-lib.org/articles/articles/managing-tokens-securely.html).
The approach is:
* Create your OAuth token locally, either outside of your package folder, or inside of it if you really want to, but **gitignored and Rbuildignored**.
* Encrypt it using e.g. the [user-friendly cyphr package](https://docs.ropensci.org/cyphr/). Save the code for this and for the step before in a file e.g. inst/secrets.R for when you need to re-create a token as even refresh tokens expire.
* For encrypting you need some sort of password. You will want to save it securely as _text_ in your [user-level .Renviron](https://rstats.wtf/r-startup.html#renviron) and in your GitHub repo secrets (or equivalent secret place for other CI services). E.g. create a key via `sodium_key <- sodium::keygen()` and get its text equivalent via `sodium::bin2hex(sodium_key)`. E.g. the latter command might give me `e46b7faf296e3f0624e6240a6efafe3dfb17b92ae0089c7e51952934b60749f2` and I will save this in .Renviron
```
MEETUPR_PWD="e46b7faf296e3f0624e6240a6efafe3dfb17b92ae0089c7e51952934b60749f2"
```
Example of a script creating and encrypting an OAuth token (for the Meetup API).
```r
# thanks Jenny Bryan https://github.com/r-lib/gargle/blob/4fcf142fde43d107c6a20f905052f24859133c30/R/secret.R
token_path <- testthat::test_path(".meetup_token.rds")
use_build_ignore(token_path)
use_git_ignore(token_path)
meetupr::meetup_auth(
token = NULL,
cache = TRUE,
set_renv = FALSE,
token_path = token_path
)
# sodium_key <- sodium::keygen()
# set_renv("MEETUPR_PWD" = sodium::bin2hex(sodium_key))
# set_renv being an internal function taken from rtweet
# that saves something to .Renviron
# get key from environment variable
key <- cyphr::key_sodium(sodium::hex2bin(Sys.getenv("MEETUPR_PWD")))
cyphr::encrypt_file(
token_path,
key = key,
dest = testthat::test_path("secret.rds")
)
```
* In tests you have a [setup / helper file](https://blog.r-hub.io/2020/11/18/testthat-utility-belt/#code-called-in-your-tests) with code like below.
```r
key <- cyphr::key_sodium(sodium::hex2bin(Sys.getenv("MEETUPR_PWD")))
temptoken <- tempfile(fileext = ".rds")
cyphr::decrypt_file(
testthat::test_path("secret.rds"),
key = key,
dest = temptoken
)
```
Now what happens in contexts where `MEETUPR_PWD` is not available?
Well there should be no tests using it!
See [our chapter about making real requests](#real-requests-chapter).
### Do not store secrets in the cassettes, mock files, recorded responses
* With vcr make sure to [configure vcr correctly](#vcr-security).
* With httptest and httptest2 only the response body (and headers, but not by default) are recorded. If those contains secrets, refer to the documentation about [redacting sensitive information](https://enpiar.com/r/httptest/articles/redacting.html) ([for httptest2](https://enpiar.com/httptest2/articles/redacting.html)).
* With webfakes you will be creating recorded responses yourself, make sure this process does not leak secrets. If you test something related to authentication, use fake secrets.
If the API you are interacting with uses OAuth for instance, make sure you are not leaking access tokens nor _refresh tokens_.
### Escape tests that require secrets
This all depends on your setup for testing [real requests](#real-requests-chapter).
You have to be sure no test requiring secrets will be run on [CRAN](#cran-preparedness) for instance.
## Sensitive recorded responses?
In that case you might want to gitignore the cassettes / mock files / recorded responses,
and skip the tests using them on continuous integration (e.g. `testthat::skip_on_ci()` or something more involved).
You'd also [Rbuildignore](https://blog.r-hub.io/2020/05/20/rbuildignore/) the cassettes / mock files / recorded responses, as you do not want to release them to CRAN.
## Further resources
Some tools might help you detect leaks or prevent them.
* [shhgit](https://github.com/eth0izzle/shhgit)'s goal is "Find secrets in your code. Secrets detection for your GitHub, GitLab and Bitbucket repositories".
* [Yelp's detect-secret](https://github.com/Yelp/detect-secrets) is "An enterprise friendly way of detecting and preventing secrets in code.".
* [git-secret](https://git-secret.io/) is a "bash tool to store your private data inside a git repo".