Containers should get a /run/user/<UID>/ tmpfs volume mount (maybe opt-in) to match Linux w/ systemd #4776
Labels: sig/node (categorizes an issue or PR as relevant to SIG Node)
Enhancement Description
/run/user/<UID>/ tmpfs volume mount to match Linux w/ systemd
/run/user/<UID>/ tmpfs volume mount kubernetes#126394
(k/enhancements) update PR(s):
(k/k) update PR(s):
(k/website) update PR(s):

What would you like to be added?

Containers should get a /run/user/<UID>/ tmpfs volume mount.

Why is this needed?
Generally this is needed for conformance with the de facto standard that systemd sets by providing this directory. Part of this is in the FHS, and part in XDG (see below). They provide this feature for the same sorts of reasons that we need it specifically: to store temporary data on a per-user basis under well-known names that are not subject to attack via /tmp/ being world-writable (the "sticky" bit is insufficient to fully protect against attacks here). Why bother for Kubernetes? Well, because libraries may want to use /run/user/<UID>/, and it's much easier to deal with the absence of that directory by just always having that directory. Making that directory's presence universal means that Kubernetes needs to provide it, at least optionally.

We've implemented a token cache system that resembles Kerberos credentials caches in that the caches are kept in temporary storage, preferably tmpfs, but /tmp/ is not an appropriate place (see below) when that code is used on multi-user systems. We want to use temporary storage because, for example, we don't want these token caches to survive reboots.

Kubernetes pods and containers are not multi-user systems, but libraries that do this sort of caching need to easily support many kinds of environments. Therefore it would be nice to have /run/user/<UID>/ be universally available, so that libraries can use it w/o concern about use on multi-user systems vs. single-user systems. Where /run/user/<UID>/ exists it is created by PAM, but obviously if it were to exist in Kubernetes containers it should be created by Kubernetes.

Background:
/tmp/ is not appropriate for caches of this sort because on multi-user systems other users can mount attacks on such caches. When the cache code is written defensively such attacks can amount to no more than a denial-of-service, but still, it would be easier to safely code such caches if they could use a temporary location that is guaranteed to have the correct permissions (0700) and where none of the parent directories can have world-writable permissions like 0777 or 01777.

Kerberos client libraries typically use files named /tmp/krb5cc_<UID>, or directories named similarly -- well-known names, not mkstemp()ed names, as these need to be easily found without having to look through a possibly-huge directory listing. In terms of Unix history /run/user/<UID>/ is very new, and generally it is only ever created by PAM. PAM is not used in starting containers in Kubernetes, therefore /run/user/<UID>/ does not exist in Kubernetes containers.

Because such files have to have well-known names, they can be subject to attack on multi-user systems, e.g., by creating /tmp/krb5cc_1000 as a symlink to some other file, or creating it as a regular 0666 mode file, etc. Clients need to use O_NOFOLLOW and/or lstat(2) and fstat(2) to make sure that they open only regular, non-symlink files, and they need to check that the file is owned by the UID that getuid() returns and that the file has appropriate permissions (0600).

Because we use aud in our tokens to limit their applicability we also have apps that need many tokens. Therefore we want to be able to cache them. Because some of our apps are multi-process apps, or invoke external short-lived programs that fetch and use tokens, we will be able to reduce load on our issuers by having a file-based cache as opposed to in-memory caches only. In our file-based cache of tokens we currently use a 0700 /tmp/tokens_<UID>/ directory, with 0600 regular files in there named after a hash of the issuer and audience of the token modulo the max number of tokens allowed in the cache, and we write the issuer, audience, expiration, token, and other metadata into those files, one token per file. Ideally we would have /run/user/<UID>/; then we could use /run/user/<UID>/tokens/ (or some such). (Clearly managing the namespace in /run/user/<UID>/ may eventually be a problem, but at this time it is not, and anyway it won't be a problem for the Kubernetes community to manage.)

/run/ is part of the FHS. /run/user/<UID>/ is not part of the FHS, but a) it exists on Linux systems running systemd, b) it is part of XDG (as the usual value of XDG_RUNTIME_DIR) and exists on FreeBSD. (I don't have a FreeBSD system to test on, but I think perhaps they use /run/user/${USER}/ rather than /run/user/<UID>/. Using a UID is better than a username because it's always possible to call getuid() to get the UID, while a username is not always possible or as easy to obtain.)
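
To make the cache layout and the defensive checks described above concrete, here is a rough Python sketch. It is illustrative only and not the actual implementation: the helper names (cache_dir, slot_name, ensure_private_dir, open_cache_file), the use of SHA-256, and the capacity of 64 are all assumptions made for the example.

```python
import hashlib
import os
import stat

MAX_TOKENS = 64  # hypothetical capacity; the real limit is not stated in this issue


def cache_dir(uid: int, run_root: str = "/run/user", tmp_root: str = "/tmp") -> str:
    """Prefer <run_root>/<UID>/tokens; fall back to <tmp_root>/tokens_<UID>."""
    run_dir = os.path.join(run_root, str(uid))
    if os.path.isdir(run_dir):
        return os.path.join(run_dir, "tokens")
    return os.path.join(tmp_root, f"tokens_{uid}")


def slot_name(issuer: str, audience: str, max_tokens: int = MAX_TOKENS) -> str:
    """Well-known file name: hash(issuer, audience) modulo the cache capacity."""
    # SHA-256 is an assumption; any stable hash would serve the same purpose.
    digest = hashlib.sha256(f"{issuer}\0{audience}".encode()).digest()
    return str(int.from_bytes(digest[:8], "big") % max_tokens)


def ensure_private_dir(path: str) -> None:
    """Create the cache directory as 0700, or verify an existing one is ours."""
    try:
        os.mkdir(path, 0o700)
    except FileExistsError:
        pass
    st = os.lstat(path)  # lstat so a planted symlink is not followed
    if not stat.S_ISDIR(st.st_mode):
        raise PermissionError(f"{path} is not a directory")
    if st.st_uid != os.getuid() or stat.S_IMODE(st.st_mode) != 0o700:
        raise PermissionError(f"{path} has an unexpected owner or mode")


def open_cache_file(path: str) -> int:
    """Open an existing cache file defensively and return its descriptor."""
    # O_NOFOLLOW rejects a symlink as the final path component.
    # O_NONBLOCK keeps a planted FIFO from blocking the open.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK)
    try:
        st = os.fstat(fd)  # check the object actually opened, not the name
        if not stat.S_ISREG(st.st_mode):
            raise PermissionError(f"{path} is not a regular file")
        if st.st_uid != os.getuid():
            raise PermissionError(f"{path} is not owned by the caller")
        if stat.S_IMODE(st.st_mode) != 0o600:
            raise PermissionError(f"{path} does not have mode 0600")
    except Exception:
        os.close(fd)
        raise
    return fd
```

A caller would combine these as `d = cache_dir(os.getuid())`, `ensure_private_dir(d)`, then `open_cache_file(os.path.join(d, slot_name(issuer, audience)))`. With a guaranteed 0700 /run/user/<UID>/ whose parents are not world-writable, most of these checks become a second line of defense. Under /tmp/ they are the only thing standing between the cache and another local user.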