Merge pull request #1224 from trapexit/readdir
Add readdir policies
trapexit committed Aug 11, 2023
2 parents d817fa4 + c92a100 commit 9849bcd
Showing 22 changed files with 1,076 additions and 171 deletions.
49 changes: 28 additions & 21 deletions README.md
@@ -256,6 +256,9 @@ These options are the same regardless of whether you use them with the
concatenated together with the longest common prefix removed.
* **func.FUNC=POLICY**: Sets the specific FUSE function's policy. See
below for the list of value types. Example: **func.getattr=newest**
* **func.readdir=seq|cosr|cor|cosr:INT|cor:INT**: Sets `readdir`
policy. INT value sets the number of threads to use for
concurrency. (default: seq)
* **category.action=POLICY**: Sets policy of all FUSE functions in the
action category. (default: epall)
* **category.create=POLICY**: Sets policy of all FUSE functions in the
@@ -682,9 +685,8 @@ rather than file paths, which were created by `open` or `create`. That
said many times the current FUSE kernel driver will not always provide
the file handle when a client calls `fgetattr`, `fchown`, `fchmod`,
`futimens`, `ftruncate`, etc. This means it will call the regular,
path based, versions. `readdir` has no real need for a policy given
the purpose is merely to return a list of entries in a
directory. `statfs`'s behavior can be modified via other options.
path based, versions. `statfs`'s behavior can be modified via other
options.

When using policies which are based on a branch's available space the
base path provided is used. Not the full path to the file in
@@ -715,8 +717,7 @@ In cases where something may be searched for (such as a path to clone)
### Policies

A policy is the algorithm used to choose a branch or branches for a
function to work on. Think of them as ways to filter and sort
branches.
function to work on or generally how the function behaves.

Any function in the `create` category will clone the relative path if
needed. Some other functions (`rename`,`link`,`ioctl`) have special
@@ -725,12 +726,11 @@ requirements or behaviors which you can read more about below.

#### Filtering

Policies basically search branches and create a list of files / paths
Most policies basically search branches and create a list of files / paths
for functions to work on. The policy is responsible for filtering and
sorting the branches. Filters include **minfreespace**, whether or not
a branch is mounted read-only, and the branch tagging
(RO,NC,RW). These filters are applied across all policies unless
otherwise noted.
(RO,NC,RW). These filters are applied across most policies.

* No **search** function policies filter.
* All **action** function policies filter out branches which are
@@ -823,6 +823,26 @@ policies is not appropriate.
| search | ff |


#### func.readdir

examples: `func.readdir=seq`, `func.readdir=cor:4`

`readdir` has policies to control how it manages reading directory
content.

| Policy | Description |
|--------|-------------|
| seq | "sequential" : Iterate over branches in the order defined. This is the default and traditional behavior found prior to the readdir policy introduction. |
| cosr | "concurrent open, sequential read" : Concurrently open branch directories using a thread pool and process them in order of definition. This keeps memory and CPU usage low while also reducing the time spent waiting on branches to respond. Number of threads defaults to the number of logical cores. Can be overridden via the syntax `func.readdir=cosr:N` where `N` is the number of threads. |
| cor | "concurrent open and read" : Concurrently open branch directories and immediately start reading their contents using a thread pool. This results in slightly higher memory and CPU usage but reduced latency, particularly when using higher latency / slower speed network filesystem branches. Unlike `seq` and `cosr` the order of files could change due to the async nature of the thread pool. Number of threads defaults to the number of logical cores. Can be overridden via the syntax `func.readdir=cor:N` where `N` is the number of threads. |
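
To make the `POLICY[:INT]` value syntax in the table concrete, here is a minimal sketch of how such a value could be split into a policy name and a thread count. It is not the mergerfs parser: `ReaddirPolicy` and `parse_readdir_policy` are hypothetical names introduced for illustration, and the rules for rejecting bad input (an empty or zero count, a count on `seq`, more than four digits) are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <thread>

// Hypothetical holder for a parsed value; not a mergerfs type.
struct ReaddirPolicy
{
  std::string name;    // "seq", "cosr", or "cor"
  unsigned    threads; // 0 for "seq", which does not use a thread pool
};

// Splits "seq", "cosr", "cor", "cosr:N", or "cor:N" into name and count.
// Returns false, leaving *policy_ untouched, if the value is not valid.
bool
parse_readdir_policy(const std::string &value_,
                     ReaddirPolicy     *policy_)
{
  const std::size_t colon = value_.find(':');
  const std::string name  = value_.substr(0,colon);

  if(name == "seq")
    {
      // Assumption: a thread count makes no sense for sequential reads.
      if(colon != std::string::npos)
        return false;
      *policy_ = ReaddirPolicy{name,0};
      return true;
    }

  if((name != "cosr") && (name != "cor"))
    return false;

  // Default described in the table: the number of logical cores.
  unsigned threads = std::thread::hardware_concurrency();
  if(colon != std::string::npos)
    {
      const std::string count = value_.substr(colon + 1);

      // Assumption: only 1 to 4 decimal digits are accepted.
      if(count.empty() || (count.size() > 4))
        return false;
      if(count.find_first_not_of("0123456789") != std::string::npos)
        return false;

      threads = static_cast<unsigned>(std::stoul(count));
      if(threads == 0)
        return false;
    }

  // hardware_concurrency() may return 0 when the value is unknown.
  if(threads == 0)
    threads = 1;

  *policy_ = ReaddirPolicy{name,threads};
  return true;
}
```

With this sketch, `cor:4` yields the name `cor` with 4 threads, while plain `cosr` falls back to the logical core count.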

Keep in mind that `readdir` mostly just provides a list of file names
in a directory and possibly some basic metadata about said files. To
know details about the files, as one would see from commands like
`find` or `ls`, it is required to call `stat` on the file which is
controlled by `func.getattr`.
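
The split between listing names and fetching details can be seen with plain POSIX calls. The sketch below is illustrative only and is not mergerfs code: `EntryInfo` and `list_with_stat` are hypothetical names, and it reads an ordinary local directory. On a mergerfs mount, the `lstat` step is the part that would be answered according to the `getattr` policy.

```cpp
#include <dirent.h>
#include <sys/stat.h>
#include <sys/types.h>

#include <string>
#include <vector>

// Hypothetical record combining a name from readdir with data from lstat.
struct EntryInfo
{
  std::string name;
  bool        have_stat; // false if lstat failed for this entry
  off_t       size;      // only meaningful when have_stat is true
};

// Lists a directory the way `ls -l` roughly does: readdir supplies the
// names, and a separate lstat call per entry supplies the details.
std::vector<EntryInfo>
list_with_stat(const std::string &dirpath_)
{
  std::vector<EntryInfo> entries;

  DIR *dh = ::opendir(dirpath_.c_str());
  if(dh == nullptr)
    return entries;

  while(struct dirent *de = ::readdir(dh))
    {
      const std::string name = de->d_name;
      if((name == ".") || (name == ".."))
        continue;

      const std::string fullpath = dirpath_ + "/" + name;

      struct stat st;
      EntryInfo   info;

      info.name      = name;
      info.have_stat = (::lstat(fullpath.c_str(),&st) == 0);
      info.size      = (info.have_stat ? st.st_size : 0);

      entries.push_back(info);
    }

  ::closedir(dh);

  return entries;
}
```

The names come back from a single directory stream, but every size requires its own call, which is why the `readdir` policy and the `getattr` policy are configured separately.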


#### ioctl

When `ioctl` is used with an open file then it will use the file
@@ -891,19 +911,6 @@ returned but it will still be possible.
**link** uses the same strategy but without the removals.


#### readdir ####

[readdir](http://linux.die.net/man/3/readdir) is different from all
other filesystem functions. While it could have its own set of
policies to tweak its behavior at this time it provides a simple union
of files and directories found. Remember that any action or
information queried about these files and directories come from the
respective function. For instance: an **ls** is a **readdir** and for
each file/directory returned **getattr** is called. Meaning the policy
of **getattr** is responsible for choosing the file/directory which is
the source of the metadata you see in an **ls**.


#### statfs / statvfs ####

[statvfs](http://linux.die.net/man/2/statvfs) normalizes the source
86 changes: 66 additions & 20 deletions man/mergerfs.1
@@ -337,6 +337,11 @@ policy.
See below for the list of value types.
Example: \f[B]func.getattr=newest\f[R]
.IP \[bu] 2
\f[B]func.readdir=seq|cosr|cor|cosr:INT|cor:INT\f[R]: Sets
\f[C]readdir\f[R] policy.
INT value sets the number of threads to use for concurrency.
(default: seq)
.IP \[bu] 2
\f[B]category.action=POLICY\f[R]: Sets policy of all FUSE functions in
the action category.
(default: epall)
@@ -844,8 +849,6 @@ provide the file handle when a client calls \f[C]fgetattr\f[R],
\f[C]fchown\f[R], \f[C]fchmod\f[R], \f[C]futimens\f[R],
\f[C]ftruncate\f[R], etc.
This means it will call the regular, path based, versions.
\f[C]readdir\f[R] has no real need for a policy given the purpose is
merely to return a list of entries in a directory.
\f[C]statfs\f[R]\[cq]s behavior can be modified via other options.
.PP
When using policies which are based on a branch\[cq]s available space
@@ -902,8 +905,7 @@ In cases where something may be searched for (such as a path to clone)
.SS Policies
.PP
A policy is the algorithm used to choose a branch or branches for a
function to work on.
Think of them as ways to filter and sort branches.
function to work on or generally how the function behaves.
.PP
Any function in the \f[C]create\f[R] category will clone the relative
path if needed.
@@ -912,12 +914,12 @@ have special requirements or behaviors which you can read more about
below.
.SS Filtering
.PP
Policies basically search branches and create a list of files / paths
for functions to work on.
Most policies basically search branches and create a list of files /
paths for functions to work on.
The policy is responsible for filtering and sorting the branches.
Filters include \f[B]minfreespace\f[R], whether or not a branch is
mounted read-only, and the branch tagging (RO,NC,RW).
These filters are applied across all policies unless otherwise noted.
These filters are applied across most policies.
.IP \[bu] 2
No \f[B]search\f[R] function policies filter.
.IP \[bu] 2
@@ -1134,6 +1136,63 @@ T}@T{
ff
T}
.TE
.SS func.readdir
.PP
examples: \f[C]func.readdir=seq\f[R], \f[C]func.readdir=cor:4\f[R]
.PP
\f[C]readdir\f[R] has policies to control how it manages reading
directory content.
.PP
.TS
tab(@);
lw(26.7n) lw(43.3n).
T{
Policy
T}@T{
Description
T}
_
T{
seq
T}@T{
\[lq]sequential\[rq] : Iterate over branches in the order defined.
This is the default and traditional behavior found prior to the readdir
policy introduction.
T}
T{
cosr
T}@T{
\[lq]concurrent open, sequential read\[rq] : Concurrently open branch
directories using a thread pool and process them in order of definition.
This keeps memory and CPU usage low while also reducing the time spent
waiting on branches to respond.
Number of threads defaults to the number of logical cores.
Can be overridden via the syntax \f[C]func.readdir=cosr:N\f[R] where
\f[C]N\f[R] is the number of threads.
T}
T{
cor
T}@T{
\[lq]concurrent open and read\[rq] : Concurrently open branch
directories and immediately start reading their contents using a thread
pool.
This will result in slightly higher memory and CPU usage but reduced
latency, particularly when using higher latency / slower speed network
filesystem branches.
Unlike \f[C]seq\f[R] and \f[C]cosr\f[R] the order of files could change
due to the async nature of the thread pool.
Number of threads defaults to the number of logical cores.
Can be overridden via the syntax \f[C]func.readdir=cor:N\f[R] where
\f[C]N\f[R] is the number of threads.
T}
.TE
.PP
Keep in mind that \f[C]readdir\f[R] mostly just provides a list of file
names in a directory and possibly some basic metadata about said files.
To know details about the files, as one would see from commands like
\f[C]find\f[R] or \f[C]ls\f[R], it is required to call \f[C]stat\f[R] on
the file which is controlled by \f[C]func.getattr\f[R].
.SS ioctl
.PP
When \f[C]ioctl\f[R] is used with an open file then it will use the file
@@ -1246,19 +1305,6 @@ The above behavior will help minimize the likelihood of EXDEV being
returned but it will still be possible.
.PP
\f[B]link\f[R] uses the same strategy but without the removals.
.SS readdir
.PP
readdir (http://linux.die.net/man/3/readdir) is different from all other
filesystem functions.
While it could have its own set of policies to tweak its behavior at
this time it provides a simple union of files and directories found.
Remember that any action or information queried about these files and
directories come from the respective function.
For instance: an \f[B]ls\f[R] is a \f[B]readdir\f[R] and for each
file/directory returned \f[B]getattr\f[R] is called.
Meaning the policy of \f[B]getattr\f[R] is responsible for choosing the
file/directory which is the source of the metadata you see in an
\f[B]ls\f[R].
.SS statfs / statvfs
.PP
statvfs (http://linux.die.net/man/2/statvfs) normalizes the source
4 changes: 2 additions & 2 deletions src/config.cpp
@@ -112,7 +112,7 @@ Config::Config()
pid(::getpid()),
posix_acl(false),
readahead(0),
readdir(ReadDir::ENUM::POSIX),
readdir("seq"),
readdirplus(false),
rename_exdev(RenameEXDEV::ENUM::PASSTHROUGH),
scheduling_priority(-10),
@@ -162,6 +162,7 @@ Config::Config()
_map["func.mkdir"] = &func.mkdir;
_map["func.mknod"] = &func.mknod;
_map["func.open"] = &func.open;
_map["func.readdir"] = &readdir;
_map["func.readlink"] = &func.readlink;
_map["func.removexattr"] = &func.removexattr;
_map["func.rename"] = &func.rename;
@@ -189,7 +190,6 @@ Config::Config()
_map["pin-threads"] = &fuse_pin_threads;
_map["posix_acl"] = &posix_acl;
_map["readahead"] = &readahead;
// _map["readdir"] = &readdir;
_map["readdirplus"] = &readdirplus;
_map["rename-exdev"] = &rename_exdev;
_map["scheduling-priority"] = &scheduling_priority;
6 changes: 3 additions & 3 deletions src/config.hpp
@@ -25,15 +25,15 @@
#include "config_log_metrics.hpp"
#include "config_moveonenospc.hpp"
#include "config_nfsopenhack.hpp"
#include "config_readdir.hpp"
#include "config_rename_exdev.hpp"
#include "config_set.hpp"
#include "config_statfs.hpp"
#include "config_statfsignore.hpp"
#include "config_xattr.hpp"
#include "config_set.hpp"
#include "enum.hpp"
#include "errno.hpp"
#include "funcs.hpp"
#include "fuse_readdir.hpp"
#include "policy.hpp"
#include "rwlock.hpp"
#include "tofrom_wrapper.hpp"
@@ -135,7 +135,7 @@ class Config
ConfigUINT64 pid;
ConfigBOOL posix_acl;
ConfigUINT64 readahead;
ReadDir readdir;
FUSE::ReadDir readdir;
ConfigBOOL readdirplus;
RenameEXDEV rename_exdev;
ConfigINT scheduling_priority;
13 changes: 13 additions & 0 deletions src/fs_devid.hpp
@@ -19,6 +19,7 @@
#pragma once

#include "fs_fstat.hpp"
#include "fs_dirfd.hpp"


namespace fs
@@ -37,4 +38,16 @@ namespace fs

return st.st_dev;
}

static
inline
dev_t
devid(DIR *dh_)
{
int dirfd;

dirfd = fs::dirfd(dh_);

return fs::devid(dirfd);
}
}
4 changes: 2 additions & 2 deletions src/fs_dirfd.hpp
@@ -27,8 +27,8 @@ namespace fs
static
inline
int
dirfd(DIR *dirp_)
dirfd(DIR *dh_)
{
return ::dirfd(dirp_);
return ::dirfd(dh_);
}
}
13 changes: 13 additions & 0 deletions src/fs_inode.cpp
@@ -208,6 +208,19 @@ namespace fs
return g_func(fusepath_,fusepath_len_,mode_,dev_,ino_);
}

uint64_t
calc(std::string const &fusepath_,
const mode_t mode_,
const dev_t dev_,
const ino_t ino_)
{
return calc(fusepath_.c_str(),
fusepath_.size(),
mode_,
dev_,
ino_);
}

void
calc(const char *fusepath_,
const uint64_t fusepath_len_,
6 changes: 5 additions & 1 deletion src/fs_inode.hpp
@@ -37,7 +37,11 @@ namespace fs
const uint64_t fusepath_len,
const mode_t mode,
const dev_t dev,
const ino_t ion);
const ino_t ino);
uint64_t calc(std::string const &fusepath,
mode_t const mode,
dev_t const dev,
ino_t ino);
void calc(const char *fusepath,
const uint64_t fusepath_len,
struct stat *st);