Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fscrypt backports to ses7p #512

Merged
merged 35 commits into from
Nov 28, 2023

Conversation

irq0
Copy link

@irq0 irq0 commented Nov 28, 2023

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

jtlayton and others added 30 commits October 26, 2023 15:24
…text

If the client sends an inode with a crypto context already in the xattr
buffer, ensure that we set the fscrypt flag in the inode.

Fixes: https://tracker.ceph.com/issues/50305
Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 3b1bf45)
A flag isn't sufficient as we can't reasonably use an xattr to store the
context. Switch the fscrypt fields to two vectors of opaque bytes, one
governed by AUTH caps and the other by FILE.

Also remove the special handling for encryption.ctx xattr, since we
won't be using that going forward anyway.

Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 1a593e1)
Add fscrypt_auth and fscrypt_file attributes to the inode_t encoding.

Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 798bfa6)
Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 7204cf1)
...and update those fields when the appropriate caps are
acquired/released/flushed.

Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit ec892d3)
...and update the fscrypt_file field on setattr changes for size.

Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 9e40efc)
Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 33381ff)
Add new mask bits for those fields and update them if a setattr request
comes in with them.

Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit a354393)
We don't really expect the userland client to use these (at least not
initially) but plumb in new vxattrs for fetching and setting these
fields. The new vxattr just fetches whatever is in the in-core inode,
and setting it issues a setattr under the hood.

Also, ensure that we update these fields on cap updates and client
requests.

Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit b52dd9d)
Create a file and set both fscrypt_auth and fscrypt_file on it. Verify
that they are set, remount and verify it again.

Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 5d09332)
Encryption is currently only supported on files/directories with layouts
where stripe_count=1.  Forbid changing layouts when encryption is involved.

Signed-off-by: Luis Henriques <[email protected]>
(cherry picked from commit c9db274)
The clients will trust and need fscrypt_file field to truncate the
pagecaches and update the i_size.

Signed-off-by: Xiubo Li <[email protected]>
(cherry picked from commit 090810b)
The maximum size of the last block without the header along with
a truncate request when the fscrypt is enabled. Default value is
4KB.

Signed-off-by: Xiubo Li <[email protected]>
(cherry picked from commit 9dff0ba)
Signed-off-by: Xiubo Li <[email protected]>
(cherry picked from commit e1a980b)
For now the userland clients hasn't support the fscrypt yet, they
will be only allowed to read the encrypted files.

Signed-off-by: Xiubo Li <[email protected]>
(cherry picked from commit b73a048)
The kclient will only send truncate requests with the modified and
encrypted last block contents when new size is smaller and is not
aligned to CEPH_FSCRYPT_BLOCK_SIZE, which is 4KB for now.

Or if the Fx caps is issued and the new size is larger the kclient
will buffer the truncating. Or it will send truncate requests wihtout
the last block filled.

When the fscrypt is enabled and when truncating with a smaller size,
both the old size and new size in the truncate request will always
be rounded up to CEPH_FSCRYPT_BLOCK_SIZE, which is 4K for now, in
kclient. For example if truncating a file size from 3KB to 2KB, the
MDS will always get old_size == new_size == 4KB. So we need to check
whether there has last block data passed together with the truncate
request to make sure whether truncating to a smaller size.

The kclinet will send it's 'change_attr' along with the truncate req,
and the MDS will compare it with the one in CInode just after the MDS
successfully xlockes the CInode's filelock, if they are different
that means it's possibly some clients have update the file or have
dirty caps just before MDS xlockes the CInode's filelock. We will let
the kclient retry it by returning a -EAGAIN errno.

Then the MDS will write the last block to OSD and then truncate
the size as normal.

Currently the last block contents will be journaled together with
the project inode only and it will be cleared after the truncate
being finished, and won't make it persistent together with the
CInode:inode_t in the metadata pool.

Signed-off-by: Xiubo Li <[email protected]>
(cherry picked from commit 3b5a0ea)
The length of encrypted the snapshot name will be longer than normal
name and we need the whole encrypted name in kclient to do the base64
decode.

Signed-off-by: Xiubo Li <[email protected]>
(cherry picked from commit 5511dd7)
Signed-off-by: Yongseok Oh <[email protected]>
(cherry picked from commit 10fb8f1)
Signed-off-by: Yongseok Oh <[email protected]>
(cherry picked from commit a1a14ed)
This new configuration option will allow to define the maximum size for a
filesystem xattrs blob.  This is a filesystem-wide knob that will replace
the per-MDS mds_max_xattr_pairs_size option.

Note:

The kernel client patch to handle this new configuration was merged before
the corresponding ceph-side pull-request.  This was unfortunate because in
the meantime PR ceph#43284 was merged and the encoding/decoding of
'bal_rank_mask' got in between.  Hence the 'max_xattr_size' is being
encoding/decoded before 'bal_rank_mask'.

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit 7b8def5)
Commit eb915d0 ("cephfs: fix write_buf's _len overflow problem") added
a limit to the total size of xattrs.  This limit is respected by clients
doing a "sync" operation, i.e. MDS_OP_SETXATTR.  However, clients with
CAP_XATTR_EXCL can still buffer these operations and ignore these limits.

This patch prevents clients from crashing the MDSs by also imposing the
xattr limits even when they have the Xx caps.  Replaces the per-MDS knob
"max_xattr_pairs_size" by the new mdsmap setting that the clients can
access.

Unfortunately, clients that misbehave, i.e. old clients that don't respect
this xattrs limit and buffer their xattrs, will see them vanishing.

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit 13e07ff)

Changes from the original version:
Instead of the modification to options/mds.yaml.in which doesn't
exists in pacific remove the relevant options from common/options.cc
and common/legacy_config_opts.h
When doing a sync setxattr (MDS_OP_SETXATTR) to set the first xattr in an
inode it is possible for the client to set a huge xattr key/value that would
exceed the limit.  Prevent this from from happening by checking the size
against the maximum allowed size.

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit ca1a397)
…p schema

Let the mdsmap schema know about the new field 'max_xattr_size'.  This prevents
the following error:

tasks.mgr.dashboard.helper._ValError: \
  In `input['fs_map']['filesystems'][0]['mdsmap']`: unknown keys: {'max_xattr_size'}

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit bf7ddd8)
…ield

Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit b70c752)
Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit f3b8b52)
do_read() just uses the truncate_seq to tell how to cap the length of
the read. I see no reason that sparse reads should do anything
differently.

Change do_sparse_read() to cap the requested length at the truncate_size
if the truncate_seq in the request is newer than the one in the object.

Fixes: https://tracker.ceph.com/issues/54280
Signed-off-by: Jeff Layton <[email protected]>
(cherry picked from commit 58f3e8b)
jtlayton and others added 5 commits November 22, 2023 17:46
Without this limitation, the long snapshot names (those starting with '_')
may exceed the NAME_MAX size.

While there, document the 2 limitations to snapshot names.

Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit 3ecf2a5)
Snapshot names have to be keep bellow 240 characters, otherwise the long
snapshot names (_<SNAPSHOT-NAME>_<INODE-NUMBER>) will become too big.

Add new testcase to check that the MDS limits snapshot names.

Signed-off-by: Luís Henriques <[email protected]>
(cherry picked from commit 2846ec6)
Add setxattr_cb: nullptr where we don't initialize that field. This is
necessary on older compilers that expect fields in declaration order.
Otherwise they fail with "non-trivial designated initializers not
supported".

Signed-off-by: Marcel Lauhoff <[email protected]>
@mgfritch mgfritch merged commit 774ce50 into SUSE:ses7p-fscrypt Nov 28, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants