allow multiple threads calling fuse_main simultaneously #19

pipcet · 2014-06-15T00:41:18Z

Currently, we don't support multiple threads each setting up a fuse file system if at least one of them also wants threaded callbacks. This test program:

#!/usr/bin/perl
use threads;
use threads::shared;
use Fuse;
use POSIX qw(EIO);

mkdir("a");
mkdir("b");
async { Fuse::main(mountpoint=>"a", threaded=>1, getattr => sub { warn "a"; sleep(1); return -EIO()}) };
async { Fuse::main(mountpoint=>"b", threaded=>1, getattr => sub { warn "b"; sleep(1); return -EIO()}) };
sleep(10);
system("ls a/");
system("ls b/");

will print "b" twice, and never print "a" (it's fine with threaded=>0 in both calls). The problem is the global master_interp variable, which we use to fake a thread context in CLONE_INTERP(). This very lightly-tested patch fixes that by putting the master interpreter into the fuse context's private data; I still have to convince myself the right thing happens when we create a thread from within a FUSE callback (which might call fuse_main itself, ugh...), but

Anyway, would such a patch be acceptable?

As an aside, it seems that the return value of PLfuse_init is now ignored, so I didn't bother "fixing" that function yet.

frett · 2014-06-15T02:19:59Z

I've thought about the fact that we basically use globals to track some Fuse state, which as you found leads to the inability to mount multiple filesystems within the same process. There are also memory leaks because we don't clean up after ourselves when a mount is unmounted (we don't close threads spawned in threaded mode).

I do like this patch, and it is along the lines of what I was thinking about for addressing the issue, the one concern I have is threaded access to user_private_data through fuse_get_context() because SV's should only be used with the interpreter they were created by. (I don't think we ever actually set user_private_data, so this may not be a current concern).

I'm going to try and incorporate this patch into #17 when I get a chance (probably won't be until mid-July). I do some additional things in that pull request that would need to be incorporated into fuse_private_data_t (such as tracking independent interpreter pools per mount).

pipcet · 2014-06-15T05:08:01Z

Thanks for the response, it's always great to get feedback! I think we're getting rid of the last true global with this patch, but continue to use MY_CXT data, which will become relevant in a minute.

We do set user_private_data from the return value of the "init" method, and that does indeed appear to be going very wrong if it's not undef, for lack of a CLONE method.

Have you seen Wolfgang Hamann's patch at https://rt.cpan.org/Public/Bug/Display.html?id=93472 ? That would allow us to have multiple file systems in a single thread, potentially, and then we couldn't use MY_CXT data at all.

I was going to ask, do you foresee any major issues moving all data to fuse_get_context()->private_data? There's a comment in Fuse.xs that I can't quite make sense of:
"threaded can't be tracked in MY_CXT, because MY_CXT isn't thread-safe on perl < 5.10 and threaded is used to make threading decisions"

I suspect all would be well with a proper CLONE method and all data in a non-global structure, deallocated when we unmount the file system (if we ever do), but if there are issues with it that I'm not seeing, it'd be great to hear about them first :-)

frett · 2014-06-15T05:46:34Z

moving all the variables from MY_CXT into the private_data struct will break threaded mode. Any SV private_data stores will only be good in the interpreter it was created in. This causes issues when using threaded mode because multiple internal interpreters are created to process the data. All of these interpreters need to have their own copy of any SV they will use created. The current design of private_data stores a single SV pointer, not one per interpreter.
This is the problem MY_CXT solves, it is where you should be storing any interpreters specific SVs (which they all should be, even "shared" SVs).

I saw Wolfgang's patch when it originally showed up, but haven't spent any time digging into it. But I agree that the current MY_CXT mechanism wouldn't work with having multiple fuse mounts running on the same interpreter.

I wrote the threaded in MY_CXT comment, it has to do with accessing an element out of the MY_CXT struct. We access the threaded flag (determine if we are actually in threaded mode) before we have locked an interpreter to a single request. This means that when using threaded mode multiple requests can come in simultaneously and all try to access the threaded variable at once. This isn't a problem post perl 5.10, but before perl 5.10 MY_CXT access is not thread-safe and multiple threads accessing it simultaneously introduce race conditions.

I believe putting the threaded flag in the private_data would be safe, and is probably a better place for it.

If you really did want to move all of MY_CXT into private_data you would need to track copies of all the SVs for every interpreter cloned. This is some of the logic that would have to be included in CLONE.

frett · 2014-06-15T05:59:47Z

I haven't looked into the event driven mode of FUSE, but I'm assuming that it would not be multi-threaded. This means that the cloned interpreter logic can go away when using the event based mode, and that eliminates a lot of the multiple interpreters accessing the same SV issues.

pipcet · 2014-06-16T03:11:20Z

I indeed ended up cloning all the data that's simultaneously per-THX and per-FS myself, in the noglobals branch at 2bbcf54 https://github.com/pipcet/perl-fuse/tree/noglobals. However, looking at Wolfgang's patch again, it seems his solution is far superior: we can write our own main loop and create as many threads as we desire without ever having to recycle THXes—we just never call pthread_create from C code, so that problem just goes away, I think.

In any case, I don't think there's a realistic scenario for having several per-thread-per-FS structures for the same thread: it can happen if you call Fuse::main from within a fuse handler, but if you really do need to launch new threads in response to Fuse events, it's probably best to set up a queue and create the threads from outside the fuse context.

allow multiple threads to provide fuse file system simultaneously

3a8eef1

fix obvious bug

c5b1c3d

oops. another quite obvious bug

b0f2b57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

allow multiple threads calling fuse_main simultaneously #19

allow multiple threads calling fuse_main simultaneously #19

pipcet commented Jun 15, 2014

frett commented Jun 15, 2014

pipcet commented Jun 15, 2014

frett commented Jun 15, 2014

frett commented Jun 15, 2014

pipcet commented Jun 16, 2014

allow multiple threads calling fuse_main simultaneously #19

Are you sure you want to change the base?

allow multiple threads calling fuse_main simultaneously #19

Conversation

pipcet commented Jun 15, 2014

frett commented Jun 15, 2014

pipcet commented Jun 15, 2014

frett commented Jun 15, 2014

frett commented Jun 15, 2014

pipcet commented Jun 16, 2014