Replace calls to ls and rm with perl functions. #42
Open · rapier1 wants to merge 2 commits into hjmangalam:master from rapier1:master
Conversation
This avoids issues when there are too many cache files for ls to process with a wildcard.
The new method of counting the number of cache files makes this unnecessary.
Thanks very much Chris. Good points - a lot of the code was pulled together quite haphazardly (as you might have noticed).
I'm ambivalent about getting rid of the fpart file number limits, since too many fpart files have an impact on other parts of the filesystem, and there's churn when starting too many rsyncs.
I'm finishing up some other work, but I'll try to merge these over the weekend.
Harry
On Fri, Apr 9, 2021 at 9:42 AM Chris Rapier wrote:
> This avoids issues when there are too many cache files for ls to process with a wildcard. While this doesn't happen all that often, I've run into problems when moving very large data sets, especially when they had files with widely varying sizes. I *think* this means some of the checks for too many cache files can be removed as well, but I just wanted to submit the basics at this point. I haven't seen any notable performance issues even when processing 50,000 cache files.
> I also removed trailing whitespace from some of the lines (M-x delete-trailing-whitespace in emacs).
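For illustration, the approach of replacing `ls` and `rm` shell-outs with Perl built-ins can be sketched roughly like this. The function names are illustrative, not the ones used in parsyncfp itself; the point is that `readdir()` and `unlink()` never build a shell command line, so they cannot hit the kernel's argument-list limit the way `ls cache.*` or `rm cache.*` can.

```perl
#!/usr/bin/env perl
# Sketch: count and remove cache files with Perl built-ins instead of
# shelling out to `ls` and `rm` with a wildcard. Names are illustrative.
use strict;
use warnings;

# Count files in $dir whose names start with $prefix, via readdir()
# rather than `ls $dir/$prefix*`.
sub count_cache_files {
    my ($dir, $prefix) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @files = grep { /^\Q$prefix\E/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return scalar @files;
}

# Remove the same files with unlink() rather than `rm $dir/$prefix*`.
# Returns the number of files removed.
sub remove_cache_files {
    my ($dir, $prefix) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @files = grep { /^\Q$prefix\E/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    unlink map { "$dir/$_" } @files;
    return scalar @files;
}
```

Because the directory is scanned one entry at a time, this behaves the same for 50 cache files or 50,000.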
Glad to be of help.
Just so you have an example - we used parsyncfp to move our primary data storage system to a new system, something like 6 to 7 PB of data. Both filesystems were using lustre (which is its own issue). I wrote a wrapper so users could fire off their own runs of parsyncfp as slurm jobs. We ended up using an NP of 16 and a chunksize of -4G to override the cache limit. This was necessary as we had some users with more than 1 PB of data. We dedicated 4 slurm nodes to these jobs and usually had 2 or 3 people per node. These are *beefy* nodes with 64 cores and 128 threads, fully dedicated to parsyncfp tasks. We were seeing throughput peaking at 2500 MB/s and averaging 853 MB/s (though the median was probably closer to 1200). So I don't think we were seeing that much in the way of thrashing, but we did have really good equipment for this. That said, the switch to disable the MAX_FPART works fine.
Chris
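A hypothetical sketch of the kind of wrapper Chris describes: build an sbatch script that runs one parsyncfp transfer so each user can submit their own copy as a slurm job. The parsyncfp option spellings (`--NP`, `--chunksize`, `--startdir`), the argument layout, and all paths below are assumptions for illustration only; check them against `parsyncfp --help` on your installation.

```perl
#!/usr/bin/env perl
# Hypothetical slurm wrapper sketch: generate the text of an sbatch
# script for one parsyncfp transfer. Option names and paths are
# assumptions, not taken from parsyncfp's actual interface.
use strict;
use warnings;

# Return the text of an sbatch script for one transfer.
sub make_sbatch_script {
    my ($src, $dest, $np, $chunksize) = @_;
    return <<"EOF";
#!/bin/bash
#SBATCH --job-name=pfp-transfer
#SBATCH --nodes=1
#SBATCH --exclusive
parsyncfp --NP=$np --chunksize=$chunksize --startdir=$src . $dest
EOF
}

# A user-facing driver would write this to a file and hand it to sbatch:
#   my $script = make_sbatch_script('/lustre/old/project', 'newfs:/lustre/project', 16, '4G');
#   open my $fh, '>', 'pfp.sbatch' or die $!;
#   print $fh $script;
#   close $fh;
#   system('sbatch', 'pfp.sbatch') == 0 or die "sbatch failed";
```

Generating the script as text (rather than submitting directly) makes it easy for users to inspect or tweak their job before it runs.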
Hi Chris,
Your patches are lighter weight than the system calls I had, so I'll include them going forward, but I'm still concerned about eliminating $MAX_FPART_FILES, simply because a novice will use a value that generates literally tens of thousands of them, and unless I'm missing something, that's not something you want, since starting up a bazillion rsyncs takes time as well. There's a tradeoff between early starts (lots of tiny fpart chunks) and late starts (smaller numbers of larger fpart chunks). In fact this is something I'll mention to Ganael (fpart's author) - can fpart be told to chunk X number of small files (for startup) and then switch to larger chunks for the main run?
Also, I've gotten the multihost version running, and after some testing on a fast net I'll probably be releasing it within a week - it's a major reworking of the code, so I'll include your code, but not as a simple pull.
Also, did you get a chance to look at the RoundRobin changes I suggested? Did any of them work the way you wanted?
Best wishes and thanks for your contribution to pfp.
Harry