Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RF] Make RooBatchCompute thread safe #14877

Merged
merged 4 commits into from
Mar 5, 2024

Conversation

guitargeek
Copy link
Contributor

In the RooBatchCompute CPU library, all scalar inputs have to be copied
n times into a buffer that is as long as the SIMD registers, to allow
for vectorization in all cases.

To avoid frequent memory allocations, this buffer was made a static
variable in the original implementation of the batchcompute library,
which of course made it non-threadsafe.

This is now hitting us, because RooFit needs to be able to do multiple
fits concurrently. This is a requirement for CMSSW, and a blocker for
ROOT master adoption in CMSSW since the new CPU backend is the default:
cms-sw/cmsdist#9034

This commit fixes the concurrency problem by doing the buffering in the
DataMaps that are used in the RooFit::Evaluator. Like this, multiple
computation graphs can be concurrently evaluated.

It was tested with the ATLAS benchmarks in rootbench that the fitting
performance remains the same.

This PR also makes some code style and memory management improvements documented in the commit messages.

It also updates the documentation to make clear the new CPU backend is now the default, which was not done in this PR:
#14742

Copy link

github-actions bot commented Mar 4, 2024

Test Results

    12 files      12 suites   2d 8h 12m 42s ⏱️
 2 579 tests  2 579 ✅ 0 💤 0 ❌
28 987 runs  28 987 ✅ 0 💤 0 ❌

Results for commit 2abf7e5.

♻️ This comment has been updated with latest results.

This can be quite a big overhead if the dataset is small.
@root-project root-project deleted a comment from phsft-bot Mar 4, 2024
@root-project root-project deleted a comment from phsft-bot Mar 4, 2024
@root-project root-project deleted a comment from phsft-bot Mar 4, 2024
@root-project root-project deleted a comment from phsft-bot Mar 4, 2024
@root-project root-project deleted a comment from phsft-bot Mar 4, 2024
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T15:56:58.395Z] FAILED: roofit/roofitcore/CMakeFiles/RooFitCore.dir/src/RooPolyVar.cxx.o
  • [2024-03-04T15:56:59.328Z] /home/sftnight/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooPolyVar.cxx:148:43: error: no matching function for call to ‘std::vector<std::__ROOT::span<const double> >::push_back(std::array<double, 64>&)’

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T16:02:52.337Z] /home/sftnight/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooPolyVar.cxx:148:21: error: no matching function for call to ‘std::vector<std::__ROOT::span<const double> >::push_back(std::array<double, 64>&)’

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-performance-centos8-multicore/soversion.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T16:08:14.691Z] /data/sftnight/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooPolyVar.cxx:148:43: error: no matching function for call to ‘std::vector<std::__ROOT::span<const double> >::push_back(std::array<double, 64>&)’

@phsft-bot
Copy link
Collaborator

Build failed on windows10/default.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T16:50:52.798Z] C:\build\workspace\root-pullrequests-build\root\roofit\roofitcore\src\RooPolyVar.cxx(148,43): error C2664: 'void std::vector<std::__ROOT::span<const double>,std::allocator<std::__ROOT::span<const double>>>::push_back(const _Ty &)': cannot convert argument 1 from 'std::array<double,64>' to 'const _Ty &' [C:\build\workspace\root-pullrequests-build\build\roofit\roofitcore\RooFitCore.vcxproj]

@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2204/nortcxxmod.
Running on root-ubuntu-2204-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T18:13:02.099Z] /home/sftnight/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooPolyVar.cxx:150:21: error: no matching function for call to ‘std::vector<std::__ROOT::span<const double> >::push_back(std::__ROOT::span<double>&)’

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T18:17:01.719Z] FAILED: roofit/roofitcore/CMakeFiles/RooFitCore.dir/src/RooPolyVar.cxx.o
  • [2024-03-04T18:17:02.649Z] /home/sftnight/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooPolyVar.cxx:150:51: error: no matching function for call to ‘std::vector<std::__ROOT::span<const double> >::push_back(std::__ROOT::span<double>&)’

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-performance-centos8-multicore/soversion.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T18:29:49.251Z] /data/sftnight/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooPolyVar.cxx:150:51: error: no matching function for call to ‘std::vector<std::__ROOT::span<const double> >::push_back(std::__ROOT::span<double>&)’

@phsft-bot
Copy link
Collaborator

Build failed on windows10/default.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T19:00:19.942Z] C:\build\workspace\root-pullrequests-build\root\roofit\roofitcore\src\RooPolyVar.cxx(150,51): error C2664: 'void std::vector<std::__ROOT::span<const double>,std::allocator<std::__ROOT::span<const double>>>::push_back(const _Ty &)': cannot convert argument 1 from 'std::__ROOT::span<double>' to 'const _Ty &' [C:\build\workspace\root-pullrequests-build\build\roofit\roofitcore\RooFitCore.vcxproj]

@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

In the RooBatchCompute CPU library, all scalar inputs have to be copied
n times into a buffer that is as long as the SIMD registers, to allow
for vectorization in all cases.

To avoid frequent memory allocations, this buffer was made a `static`
variable in the original implementation of the batchcompute library,
which of course made it non-threadsafe.

This is now hitting us, because RooFit needs to be able to do multiple
fits concurrently. This is a requirement for CMSSW, and a blocker for
ROOT master adoption in CMSSW since the new CPU backend is the default:
cms-sw/cmsdist#9034

This commit fixes the concurrency problem by doing the buffering in the
DataMaps that are used in the `RooFit::Evaluator`. Like this, multiple
computation graphs can be concurrently evaluated.

It was tested with the ATLAS benchmarks in `rootbench` that the fitting
performance remains the same.
The Batches classes in RooBatchCompute have only public data members
anyway. It's not necessary to have extra member functions for changing
them.
@phsft-bot
Copy link
Collaborator

Build failed on windows10/default.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2024-03-04T21:36:56.155Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm-project\llvm\include\llvm/Support/AlignOf.h(29,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm-project\clang\lib\AST\DeclBase.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\tools\clang\lib\AST\obj.clangAST.vcxproj]
  • [2024-03-04T21:36:56.155Z] C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\vector(1728,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm-project\clang\lib\Analysis\ReachableCode.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\tools\clang\lib\Analysis\obj.clangAnalysis.vcxproj]
  • [2024-03-04T21:36:56.155Z] c1xx : fatal error C1356: unable to find mspdbcore.dll [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\lib\Remarks\LLVMRemarks.vcxproj]
  • [2024-03-04T21:36:56.155Z] c1xx : fatal error C1356: unable to find mspdbcore.dll [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\lib\Remarks\LLVMRemarks.vcxproj]
  • [2024-03-04T21:36:56.155Z] c1xx : fatal error C1356: unable to find mspdbcore.dll [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\lib\Remarks\LLVMRemarks.vcxproj]
  • [2024-03-04T21:36:56.155Z] C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\xstring(4033,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm-project\clang\lib\Basic\Targets\X86.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\tools\clang\lib\Basic\obj.clangBasic.vcxproj]
  • [2024-03-04T21:36:56.155Z] C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\include\llvm/Frontend/OpenMP/OMP.inc(4698,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm-project\clang\lib\Parse\ParseAST.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\tools\clang\lib\Parse\obj.clangParse.vcxproj]
  • [2024-03-04T21:36:56.155Z] c1xx : fatal error C1356: unable to find mspdbcore.dll [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\lib\Remarks\LLVMRemarks.vcxproj]
  • [2024-03-04T21:36:56.155Z] c1xx : fatal error C1356: unable to find mspdbcore.dll [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\lib\Remarks\LLVMRemarks.vcxproj]
  • [2024-03-04T21:36:56.155Z] C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\tuple(300,5): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm-project\clang\lib\Parse\ParseStmtAsm.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm-project\llvm\tools\clang\lib\Parse\obj.clangParse.vcxproj]

And 71 more

@guitargeek
Copy link
Contributor Author

The failures on Windows Jenkins are the occasional out-of-heap-space of the compiler. It has nothing to do with this PR, and the new CI is green for Windows 32 bit and 64 bit.

Copy link
Member

@lmoneta lmoneta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
I trust you Jonas on these important improvements and fixes

@guitargeek guitargeek merged commit 30a4fd7 into root-project:master Mar 5, 2024
15 of 17 checks passed
@guitargeek guitargeek deleted the batchcompute_no_vectors branch March 5, 2024 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants