-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Likely O(n^2) behaviour for watchTree
#101
Comments
Hmm, I don't see anything that looks O(N^2) but a couple things jump out at me in the recursive watch code
We could improve 1) by gathering directories and adding watches in a streaming fashion, perhaps with Any chance you could create a flamegraph with https://hackage.haskell.org/package/ghc-prof-flamegraph ? |
I will try to find time to generate the flamegraph and get back to you. By the way you should be able to reproduce this behaviour solely based on the number of directories watched over -- tens of thousands of directories should provoke the behaviour. By the way 2, my friend @turboMaCk has tried a continuation-passing-style approach to listing contents of a directory recursively that seems to be pretty performant even on higher file/dir counts: https://github.com/turboMaCk/unix-recursive EDIT: given the profile info in the original post, I think your point 2 won't be the culprit, since it didn't show up in the profile info. It all seems to be in directory manipulation, canonicalization etc. |
GitHub didn't let me upload a SVG, so here is a gist: |
Thanks! Interesting how the biggest amount of time is actually in the Hmm, this directory listing function will still realize all the directories in memory at once, right? If we're optimizing that then I'd rather do it in a streaming fashion so the peak memory usage is low. |
@thomasjm There's also a possibility (that I'm currently doing in my app) of filtering certain folder IDs and not recursing deeper into them. I don't know how that fits into the FSNotify API / responsibilities, but perhaps it could be a configuration parameter. That could cut a large chunk of unneeded work in certain usecases (like mine). |
@thomasjm Is there a way I could try out the mentioned commit without a proper release? Does Stack support specifying the version with a commit hash perhaps? |
Well, you could use the non-recursive watch function to set the watches on exactly the folders you want.
Yes, like this in
Hopefully the work in progress I have going on in
|
sorry to jump in. I'm sort of in the middle of identifying different pieces which could be optimized to work better on large FS trees. I'm yet to identify exactly where the wins are but I'll let you know if I find something either by opening PR if there are some quick wins or by opening an issue where we can discuss more specific details. |
I had a deeper look into this issue both by analyzing isolated things as well as directly analyzing the performance of code @Janiczek observed this behavior on. For the record it's this commit Janiczek/elmid@8cf9636 but I also needed to change few hard-coded paths to make it possible for me to use. I'm able to reproduce higher than optimal CPU loads but as far as I can say code isn't O(n^2). It scales close to O(n) based on number of directories and files - that's being said I did not put much more effort into analyzing big O once I discovered the scaling is acceptable. For this and similar as this cases the biggest problem is inability of I also have few points to @thomasjm's first reply:
|
Thanks for the information @turboMaCk, interesting stuff. A few thoughts/questions:
I'm confused because your results for |
@thomasjm I know the answer to 1: it's the |
Yes but it took the same program 70s which is 10x longer to finish than |
Oh I see, I didn't notice the total time for the Anyway, it would be good to see your measurements on the I think we could invest effort in optimizing the crucial section here, which seems to be the On the other hand, it sounds like what you really need is a way to avoid running over gigantic directories like |
So I have a result for an implementation from master/HEAD. I've ran all other tests as well and basically the performance of other implementations is same as in link I posted so I won't spam this issue further. So CPU time is on par with dirstream, however current implementation consumes near 10GB of ram which is consistent with forcing the list too early from my other measurements. I also agree with all what you say in last post. I think in ideal case there would be 2 separate issues and PRs coming from this issue. a) provide new recursive API can help folks avoid into this trap of doing more work than is obvious/needed. I can help you with tackling these if you want. I think especially b) should be doable without changes in API even of this internal module. a) will involve some API design
While I think using conduit or pipes is a good way to develop application I personally don't like the idea to use it for library internals unless really necessary. Libraries with less dependencies easier to maintain and are often preferred by developers because they don't make as many changes for them. Reproducing my results:All the code including fsnotify implementation can be found there: https://github.com/turboMaCk/unix-recursive/tree/test-fsnotify-functions (make sure you're using branch just be aware
EDIT: Sorry one more thing I forget to mention. I think the fact that recursive directory walking is not as optimal as it should be at the moment is really not that big deal because with current |
Okay, I made a stab at an efficient implementation. There are a few TODOs before it's ready to ship but the tests are passing. Want to try it? I looked at your lib for reference on the dirstream stuff :) https://github.com/haskell-fswatch/hfsnotify/blob/master/src/System/FSNotify/Linux.hs#L193 For the filtering stuff, I think we want to just add a new watchTree function that takes an extensible options object: watchTree' :: WatchManager -> RecursiveListeningOptions -> FilePath -> ActionPredicate -> Action -> IO StopListening
data RecursiveListeningOptions = RecursiveListeningOptions {
recursiveListeningOptionsDirectoryFilter :: FilePath -> Bool -- IO Bool ?
}
defaultRecursiveListeningOptions = RecursiveListeningOptions {
recursiveListeningOptionsDirectoryFilter = const True
} I haven't had time to do the filtering stuff yet (it will need to be added to each platform) but a PR would be welcome if you want it soon. |
If this happens, could we please also export the underlying That's what I've always wanted from this library - I end up converting all the EDIT: The time of an abstract |
https://man7.org/linux/man-pages/man7/fanotify.7.html might be able to help with this? |
I've had a situation where I ran
watchTree
with default config over a directory with 170k files and 15k dirs.The recursive watch registration process ate ~400% CPU (as reported by htop) for about 20s, after which the CPU usage fell down to normal idle levels.
After trying to remove
node_modules
(erm...) the directory count went to 1k, and the CPU usage on startup was ~100% for about 2 seconds.This makes me think there's some O(n^2) behaviour in the fsnotify code. (For example,
inotifywait
CLI tool has no such CPU usage when doing the same task.)Here's a relevant snippet of profile information from the run with 15k dirs:
The other code paths not present above took ~0% of the measured time.
Let me know if I can help in some other way.
The text was updated successfully, but these errors were encountered: