Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NAS-123903 / NFS41 ACL support for NFS4 protocol #142

Merged
merged 5 commits into from
Dec 4, 2023

Conversation

bmeagherix
Copy link
Contributor

@bmeagherix bmeagherix commented Nov 27, 2023

Second Pull-Request

This is the second version pull request wrt this change. The first (#138) was to make the changes to the truenas/linux-6.1 branch, so it has now been abandoned and used as the basis of this PR. (Unfortunately, since the same branch name was used, it is no longer possible to examine the deltas against 6.1) The commits in this PR incorporates the changes made following review of the last PR.

Background

ZFS currently exposes native ZFS nfsv4 ACL type through the system.nfs4_acl_xdr xattr.

A previous PR was "Implement native NFSv4 ACLs in NFS server" which allowed the ZFS ACL to be exposed thru the existing NFS 4.0 ACL support in nfsd. One limitation is that the NFSv4.0 ACL specification does not allow the acl_flag to be included.

NFSv4.1 added support for a slightly more capable variant of ACL, the DACL (and SACL) which also includes support for an acl_flag field.

Upstream linux includes client-side support for DACLs, but not server-side support. Because no in-kernel filesystems have suitable ACL support, it seems very unlikely that changes to nfsd to support DACLs will be accepted upstream.

Overview

This PR adds:

  • server-side DACL support, based on the underlying ZFS system.nfs4_acl_xdr xattr.
  • client-side support for system.nfs4_acl_xdr xattr, utilizing the existing client DACL support (for transport).

This will permit existing tools that operate on the system.nfs4_acl_xdr xattr (e.g. nfs4xdr_setfacl) to operate on a NFS v4.1 mounted client.

Implementation

The PR has been split into 5 initial commits.

  1. Add a flag to struct nfs4_acl and move enum nfs4_acl_type for reuse in nfsd.
    This will permit struct nfs4_acl to also support DACLs (or SACL). The flag will remain zero for ACL.
    We will use the nfs4_acl_type enum at various call-sites to determine whether we're operating on an ACL or DACL. (This pattern was already being used by nfs client in the upstream kernel; move the enum so we can do likewise in nfsd.)

  2. Add DACL support to nfsd for NFSv4.1+
    DACL support is modeled on the existing ACL support, and the existing nfs infrastructure for attr handling does most of the heavy lifting here.
    Some functions (e.g. nfsd4_get_nfs4_acl) have their signatures widened to include an acl_type enum parameter. We add FATTR4_WORD1_DACL to the NFSD4_1_SUPPORTED_ATTRS_WORD1 and NFSD_WRITEABLE_ATTRS_WORD1 bitmasks to allow plumbing DACL through the various existing routines, but note that this is tempered by IS_NFSV4ACL checks.

  3. Move some (iX-authored) nfs4_acl_xdr handling functions from nfsd to nfs_common for reuse
    These were authored during the above-mentioned NFSv4.0 PR & used in nfsd. We now want to use them in both the nfs server and client.

  4. Add isdir parameter to generate_nfs41acl_buf and populate acl flags
    When generating the (ZFS) buffer we also want to be able to set the ACL_IS_DIR flag in the acl_flag. Since this is not included in the NFSv4.1 DACL acl_flag bits, we do not send it over the wire. However, we can regenerate it if the inode in question refers to a directory. Hence the plumbing.

  5. Add nfs client-side support for system.nfs4_acl_xdr
    The new accessor functions for the system.nfs4_acl_xdr xattr are implemented using the existing client-side support for DACL for transport of the DACL to/from the nfsd.
    We then need to perform a struct nfs4_acl to/from DACL XDR, implemented by nfs4_aclbuf_decode and nfs4_encode_acl_to_buf. (BTW, even if it were accessible, the existing nfsd code to do the equivalent functionality is deeply embedded into the server-side infrastructure & unsuitable for reuse here.) Only OWNER@, GROUP@, EVERYONE@ and numeric uids or gids are supported in the ACEs, but this turns out to not be a restriction here, as that is how the system.nfs4_acl_xdr presents/transfers them anyway.
    Finally, encoding/decoding is required to translate the struct nfs4_acl to/from the ZFS-style ACL using the functions moved to nfs_common.

Limitations / caveats

The existing upstream nfs client contains a nfs4_cached_acl. This can hold either the ACL or DACL (or SACL). If, for example, the DACL has been cached by the client and then the DACL is changed on the server outside the scope of NFS, then the client will have no knowledge of this change and will retain the old value for a period of time. (Appears to be cached several seconds to 10s of seconds, based on activity.) This is the case even for upstream POSIX ACL-based NFSv4.0 ACL (system.nfs4_acl), and before this PR.

This can be worked around by e.g. fetching the ACL and then fetching the DACL again to force refresh the cache.

Test

Jenkins "API Tests - TrueNAS SCALE with middleware reinstalled from a branch Incremental" job #1400 passed, and exercised this functionality (middleware branch NAS-123903).

@bmeagherix bmeagherix added the WIP Work in Progress label Nov 27, 2023
@bugclerk
Copy link

@bmeagherix
Copy link
Contributor Author

retest this please

@bmeagherix bmeagherix removed the WIP Work in Progress label Nov 28, 2023
Copy link
Contributor

@yocalebo yocalebo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no obvious issues, but made 1 very small little comment that you may ignore.

fs/nfs_common/nfsacl.c Outdated Show resolved Hide resolved
@bmeagherix
Copy link
Contributor Author

Tidied. Folded commits in response to review into prior commits. (Checked zero delta to resultant PR.)

fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs/nfs4proc.c Show resolved Hide resolved
fs/nfsd/nfs4acl.c Show resolved Hide resolved
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs/nfs4proc.c Show resolved Hide resolved
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
fs/nfs_common/nfsacl.c Outdated Show resolved Hide resolved
@bmeagherix bmeagherix requested a review from amotin December 4, 2023 15:30
fs/nfs/nfs4proc.c Outdated Show resolved Hide resolved
Also move enum nfs4_acl_type for reuse.
A DACL differs in size from an ACL by the addition of an acl_flag.

DACL support is modeled on the existing ACL support.  Some functions
(e.g. nfsd4_get_nfs4_acl) have their signatures widened to include a
acl_type parameter.
… reuse

Also now use ACE4SIZE and add a check wrt remaining in
convert_nfs41xdr_to_nfs40_acl / convert_to_nfs40_ace
The nfs4_xattr_list_nfs4_acl_xdr, nfs4_xattr_get_nfs4_acl_xdr and
nfs4_xattr_set_nfs4_acl_xdr functions are implemented using the existing
client-side support for DACL.

Currently only OWNER@, GROUP@, EVERYONE@ and numeric uids or gids are
supported in the ACEs.
@bmeagherix bmeagherix merged commit f7d0333 into truenas/linux-6.6 Dec 4, 2023
6 checks passed
@bmeagherix bmeagherix deleted the NAS-123903 branch December 4, 2023 18:54
@bugclerk
Copy link

bugclerk commented Dec 4, 2023

Not updating JIRA ticket https://ixsystems.atlassian.net/browse/NAS-123903 target versions as no JIRA version corresponds to this PR

usaleem-ix pushed a commit that referenced this pull request Jan 10, 2024
* Add flag to struct nfs4_acl to allow dacl support

Also move enum nfs4_acl_type for reuse.

* Add DACL support to nfsd (v4.1+)

A DACL differs in size from an ACL by the addition of an acl_flag.

DACL support is modeled on the existing ACL support.  Some functions
(e.g. nfsd4_get_nfs4_acl) have their signatures widened to include a
acl_type parameter.

* Move some nfs4_acl_xdr handling functions from nfsd to nfs_common for reuse

Also now use ACE4SIZE and add a check wrt remaining in
convert_nfs41xdr_to_nfs40_acl / convert_to_nfs40_ace

* Add isdir parameter to generate_nfs41acl_buf and populate acl flags

* Add nfs client-side support for NA41_NAME ("system.nfs4_acl_xdr")

The nfs4_xattr_list_nfs4_acl_xdr, nfs4_xattr_get_nfs4_acl_xdr and
nfs4_xattr_set_nfs4_acl_xdr functions are implemented using the existing
client-side support for DACL.

Currently only OWNER@, GROUP@, EVERYONE@ and numeric uids or gids are
supported in the ACEs.
wongsyrone pushed a commit to wongsyrone/truenas-scale-linux that referenced this pull request Apr 9, 2024
[ Upstream commit 7d42e09 ]

During the PCI AER system's error recovery process, the kernel driver
may encounter a race condition with freeing the reset_data structure's
memory. If the device restart will take more than 10 seconds the function
scheduling that restart will exit due to a timeout, and the reset_data
structure will be freed. However, this data structure is used for
completion notification after the restart is completed, which leads
to a UAF bug.

This results in a KFENCE bug notice.

  BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat]
  Use-after-free read at 0x00000000bc56fddf (in kfence-truenas#142):
  adf_device_reset_worker+0x38/0xa0 [intel_qat]
  process_one_work+0x173/0x340

To resolve this race condition, the memory associated to the container
of the work_struct is freed on the worker if the timeout expired,
otherwise on the function that schedules the worker.
The timeout detection can be done by checking if the caller is
still waiting for completion or not by using completion_done() function.

Fixes: d8cba25 ("crypto: qat - Intel(R) QAT driver framework")
Cc: <[email protected]>
Signed-off-by: Damian Muszynski <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix pushed a commit that referenced this pull request May 1, 2024
[ Upstream commit 7d42e09 ]

During the PCI AER system's error recovery process, the kernel driver
may encounter a race condition with freeing the reset_data structure's
memory. If the device restart will take more than 10 seconds the function
scheduling that restart will exit due to a timeout, and the reset_data
structure will be freed. However, this data structure is used for
completion notification after the restart is completed, which leads
to a UAF bug.

This results in a KFENCE bug notice.

  BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat]
  Use-after-free read at 0x00000000bc56fddf (in kfence-#142):
  adf_device_reset_worker+0x38/0xa0 [intel_qat]
  process_one_work+0x173/0x340

To resolve this race condition, the memory associated to the container
of the work_struct is freed on the worker if the timeout expired,
otherwise on the function that schedules the worker.
The timeout detection can be done by checking if the caller is
still waiting for completion or not by using completion_done() function.

Fixes: d8cba25 ("crypto: qat - Intel(R) QAT driver framework")
Cc: <[email protected]>
Signed-off-by: Damian Muszynski <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
@usaleem-ix usaleem-ix restored the NAS-123903 branch October 28, 2024 06:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants