Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put in tests for sampler #248

Merged
merged 108 commits into from
Apr 11, 2024
Merged

Put in tests for sampler #248

merged 108 commits into from
Apr 11, 2024

Conversation

vlkale
Copy link
Contributor

@vlkale vlkale commented Apr 4, 2024

This PR addresses the broader Github Issue #191, which is important for the open-source development of Kokkos Tools.

The purpose of this PR is to enhance and ensure the Kokkos Tools sampler robustness by adding tests for it. Testing the sampler is important given that:

(1) It is a parent tool to other tools connectors, i.e., other tools using it will not work if this tool does not call them correctly;
(2) It uses the parts of Kokkos Tools infrastructure that the sampler must exercise to maintain low overhead, i.e., tool-invoked fencing.

This PR adds a directory in tests for the sampler utility and tests that the sampler works correctly. Specifically, it checks that:

  1. every begin callback from the sampler has a matching end callback; and
  2. the sampler chooses the correct kernel IDs to sample based on the skip rate

The tests use CTest, and tries to follow the demangling tests for space time stack, but there are important changes needed to the general CMakeLists.txt file to facilitate testing of the sampler.

Note that corresponding testing will be needed for the kernel filter, which would be a separate PR. Testing for all tool connectors and this needs to be a separate or multiple separate PRs.

vlkale added 30 commits April 1, 2024 15:53
View  index access of x does not use []
The typo was kokkospk_end_parallel_scan. This causes the third test for the sampler to fail.
@vlkale
Copy link
Contributor Author

vlkale commented Apr 10, 2024

Your test is really weak. It will pass if the sampler does nothing and just always forward to the underlying tool.

@dalg24 Thanks. However, I do not believe this is the case: It will not pass if the sampler does nothing. In my test code, all matchers must be seen for this to pass. Also, the EXPECT_THAT function I have will cause a test to fail if the regular expression is not seen. This is the pertinent Google Tests documentation

https://google.github.io/googletest/reference/matchers.html

@dalg24 On second thought, you may be right - I re-read in the documentation that the matcher matches a single argument. The safest way to fix this is to just match the exact and full string I want to match in one matcher.

Assume we merge your test as it is proposed now and a follow PR includes this change.

diff --git a/common/kokkos-sampler/kp_sampler_skip.cpp b/common/kokkos-sampler/kp_sampler_skip.cpp

index 78cbbff..86b55e3 100644

--- a/common/kokkos-sampler/kp_sampler_skip.cpp

+++ b/common/kokkos-sampler/kp_sampler_skip.cpp

@@ -193,7 +193,7 @@ void kokkosp_begin_parallel_for(const char* name, const uint32_t devID,

                                 uint64_t* kID) {

   *kID                          = uniqID++;

   static uint64_t invocationNum = 0;

-  ++invocationNum;

+  invocationNum;

   if ((invocationNum % kernelSampleSkip) == 0) {

     if (tool_verbosity > 0) {

       std::cout << "KokkosP: sample " << *kID

Would you not want your test to fail?

I see you are showing a mistake by the programmer where they have not incremented the invocation number.

The ideal way to test to cover that is to create a regular expression in RE2 that captures the fact that there exists no samples between 0 and 149 that aren't 51 or 102.

However, as I was reading about RE2 last week in documentation and stackoverflow, there is a known problem with regular expression lookahead on some architectures and this would not be portable.

I think one easier way for now is to literally write an EXPECT_THAT for every single string that should not be in the output because sampling eliminated them, using Not(HasSubstr()). I had done that for three of them in my commit on Friday. Maybe even easier: reduce the number of invocations to 15, still taking two samples, and test the above by writing just 13 EXPECT_THAT's that test that those 13 samples haven't been taken. I will create this alternate simpler test to handle this. We can use this even easier one to test what you say.

I have a broader question here: What role should tests for Kokkos Tools play?

In the code you show above, the tool connector programmer shouldn't have written that in the first place.

I think the tests for Kokkos Tools should emphasize more catching unexpected problems at runtime within the CI environments that are hard for tools connectors programmers to grasp without understanding Kokkos Tools and infrastructure. The null fence function pointer is just one example of this. The bug you bring up is important too, but it is another category of bugs involving a simpler programming error.

I think that testing of Kokkos Tools should follow testing in Kokkos core. A difference between Kokkos Tools and Kokkos core is that Kokkos core has more externally contributed libraries/connectors rather than contributions by Kokkos developers. That's why testing is very important in Kokkos Tools. I would say documentation for a Kokkos Tools connector developers (on how to write any Kokkos Tools connector) to complement testing Kokkos Tools may be a good thing to do.

@dalg24
Copy link
Member

dalg24 commented Apr 10, 2024

I have a broader question here: What role should tests for Kokkos Tools play?

Ensure that the tools behave as they are supposed to and are compatible with Kokkos Core.

If the tests we provide do not fail for the change I suggested above, then these tests are no good.
There are other ways to ensure that the tool output is right that do not involve complex regular expressions.

@vlkale
Copy link
Contributor Author

vlkale commented Apr 10, 2024

I have a broader question here: What role should tests for Kokkos Tools play?

Ensure that the tools behave as they are supposed to and are compatible with Kokkos Core.

OK. This is consistent with the way I see it at a high-level, and this captures it concisely. I will make sure the tests address these two aspects.

By the way, I think Kokkos Tools connector documentation should also lay out this high-level requirement of testing for Kokkos Tools connectors developers so they can write out their own tests - this can be looked at later though.

If the tests we provide do not fail for the change I suggested above, then these tests are no good.

That makes sense and sounds good. I guess you were showing that example for simplicity. Maybe another possibility is that the code is correct but the variable invocationNum gets overwritten due to memory corruption from system-level/OS events. Anyway, it makes sense to create a test that invocationNum is indeed getting incremented appropriately.

There are other ways to ensure that the tool output is right that do not involve complex regular expressions.

Yes, I will do without. A regex can simplify code and repetition, but it's looking like not here.

@vlkale
Copy link
Contributor Author

vlkale commented Apr 11, 2024

@dalg24 @masterleinad I think I have answered most of your questions. The CI tests are now passing with the changes. Please let me know if there is still anything unresolved. I have left all comments unresolved until you have looked at it.

@masterleinad I am keeping std::cout for this PR for the fatal error, see comment above on why.

@crtrott crtrott merged commit c57eaa2 into kokkos:develop Apr 11, 2024
7 checks passed
vlkale added a commit to vlkale/kokkos-tools that referenced this pull request Apr 11, 2024
* CMakeLists.txt: add_library to kp_add_library

Sampler's CMakeLists.txt: add_library to kp_add_library

* kp_core.hpp: remove pointer from ptpi

* kp_sampler_skip.cpp: pass by value kokkosp_p_t_p_i

kp_sampler_skip.cpp: last parameter should be passed by value rather than pointer in kokkosp_p_t_p_i

* kp_sampler_skip.cpp: remove dereference assignment for ptpi callback

* kp_core.hpp: apply clang-format

* Create test_sampler.cpp

* CMakeLists.txt: set up test for sampler

* test_sampler.cpp: put in code for sampler test

* test_sampler.cpp: typo in comment (2 invocation --> 2 invocations)

* CMakeLists.txt: support sampler test

* kp_sampler_skip.cpp: change printf to std::out for ctests

* kp_sampler_skip.cpp: include iostream for std::cout

* Update kp_sampler_skip.cpp: \n instead of std::endl

\n is faster for performance

* kp_sampler_skip.cpp: apply clang format

* kp_sampler_skip.cpp: fix for std::out of tool-invoked fence verbose debug print

* kp_sampler_skip.cpp: apply clang format

* Rename test_sampler.cpp to test_parfor.cpp

* Update CMakeLists.txt: reduce and scan sampling tests

* Create test_parreduce.cpp

* Create test_parscan.cpp

* Update test_parreduce.cpp: parallel_reduce function 

Test based on example shown at: https://kokkos.org/kokkos-core-wiki/API/core/parallel-dispatch/parallel_reduce.html

* test_parscan.cpp: put in test for sampling parallel_scan

Parallel scan sampling test based on example here: 

https://kokkos.org/kokkos-core-wiki/API/core/parallel-dispatch/parallel_scan.html

* Update test_parreduce.cpp: fix x[i] to x(i) 

View  index access of x does not use []

* Update test_parreduce.cpp: Kokkos:: for View

* test_parreduce.cpp: reduce lambda

* Update test_parscan.cpp: operator for cuda/hip build

* test_parscan.cpp: support scan test function

* test_parreduce.cpp: reduction operator second argument 

second argument

* test_parscan.cpp: fix scan test operator

* Update test_parscan.cpp: fix scan test to have to Views

* test_parscan.cpp: policy to size

* Update kp_kernel_logger.cpp: fix typo for scan callback

The typo was kokkospk_end_parallel_scan. This causes the third test for the sampler to fail.

* test_parscan.cpp: apply clang format

* test_parreduce.cpp: apply clang format

* CMakeLists.txt: add_library -> kp_add_library

* kp_sampler_skip.cpp: put in prob samplr

* CMakeLists.txt: sampling prob test

* Create test_parfor_prob.cpp

* Create test_parreduce_prob.cpp

* Create test_parscan_prob.cpp

* test_parfor_prob.cpp: edit matchers for prob sampling

* test_parreduce_prob.cpp: fix matchers

* Update test_parscan_prob.cpp: fix parscan test matchers

* Update test_parscan_prob.cpp: fix test comments

* test_parfor_prob.cpp: fix comments

* Update test_parreduce_prob.cpp: fix comments

* test_parfor.cpp: update matcher

* Update test_parscan.cpp: fix matchers

* test_parreduce.cpp: matchers fix

* Update kp_sampler_skip.cpp: add cout for end_parallel_xxx

* Update test_parfor.cpp: put in finished with end in matchers

* Update test_parscan.cpp: update matchers / comments removal

* test_parreduce.cpp: fix matchers

* Update kp_sampler_skip.cpp: fix elipses

* test_parscan.cpp: apply clang format

* CMakeLists.txt: skip rate is 0

* test_parscan.cpp: apply clang format

* test_parfor_prob.cpp: apply clang format

* Update README.md: add sampler entry

* test_parreduce_prob.cpp: apply clang format

* kp_sampler_skip.cpp: apply clang format

* README.md: sampler README for probability

* test_parreduce.cpp: fix int ref in operator

* test_parscan.cpp: int ref in operator

* test_parscan.cpp: long int in signature

* test_parreduce.cpp: long int in second argument to operator

* Update test_parreduce.cpp: change sum to int type

* Update test_parreduce.cpp: declare sum as long int

* Update kp_kernel_logger.cpp: kokkospk_end_parallel_scan --> kokkosp_end_parallel_scan

* Update README.md

Co-authored-by: Daniel Arndt <[email protected]>

* Update test_parfor.cpp

* test_parfor.cpp: put in fence test

* Update test_parfor.cpp: put in checks for number of calls and Null Ptr fence

* test_parfor.cpp: apply clang format

* fix test par for

* test_parfor.cpp: apply clang format

* kp_sampler_skip.cpp: fix sampler std::cout prints for test

* test_parfor.cpp: apply clang format

* test_parfor.cpp: fixing matcher string for number contains

* Update test_parfor.cpp: not substr

* test_parfor.cpp: apply clang format

* kp_sampler_skip.cpp: apply clang format

* test_parfor.cpp: apply clang format

* test_parfor.cpp: apply clang format

* Update test_parfor.cpp

* test_parfor.cpp: apply clang format

* test_parfor.cpp: Times function

* delete file accidentally put in tests directory

* Update test_parfor.cpp: remove AtMost

* Update test_parfor.cpp: remove using AtMost

* kp_sampler_skip.cpp: put back in

* Update test_parreduce.cpp

* test_parfor.cpp: remove count for number of times

* Update test_parscan.cpp

* Update test_parfor.cpp: removing Contains

* kp_sampler_skip.cpp: apply clang format

* test_parfor.cpp: apply clang format

* test_parreduce.cpp: apply clang format

* test_parscan.cpp: apply clang format

* Update test_parfor.cpp: putting in times

* test_parfor.cpp: apply clang format

* Update test_parreduce.cpp: put in times

* test_parscan.cpp: put in times test

* test_parscan.cpp: apply clang format

* test_parreduce.cpp: apply clang format

* test_parscan.cpp: apply clang format

* test_parscan.cpp: apply clang format

* test_parreduce.cpp: apply clang format

* test_parfor.cpp: apply clang format

* test_parfor.cpp: apply clang format

* test_parreduce.cpp: apply clang format

* test_parscan.cpp: apply clang format

* kp_sampler_skip.cpp: std::cout to std::cerr

Co-authored-by: Daniel Arndt <[email protected]>

* test_parscan.cpp: fix declaration for variable result

Co-authored-by: Damien L-G <[email protected]>

* test_parfor.cpp: revisions of tests PR review

* test_parreduce.cpp: fix with new tests

* test_parscan.cpp: fix sampler scan

* test_parreduce.cpp: fix with correct samples and comments

* CMakeLists.txt: fix sampler skip rate

* test_parfor.cpp: add string header

* test_parreduce.cpp: add string header

* test_parscan.cpp: add string header

* Update test_parfor.cpp: fix output

* test_parreduce.cpp: fix test

* test_parscan.cpp: fix occurrences variable

* test_parfor.cpp: apply clang format

* test_parreduce.cpp: apply clang format

* test_parscan.cpp: apply clang format

* test_parreduce.cpp: apply clang format

* test_parfor.cpp: fix sampler number from 11 to 12

* test_parreduce.cpp: sample number from 11 to 12

* test_parscan.cpp: change sample from 11 to 12

* kp_sampler_skip.cpp: revert cerr to cout

* CMakeLists.txt: make global fences 0

* CMakeLists.txt: remove setting of global fence environment variable

* Put in tests for sampler  (kokkos#248)

* CMakeLists.txt: tests for prob sampling

Without fence and with fence

* parreduce.hpp: file for parallel reduce

* Update parreduce.hpp: putting once pragma for header

* parfor.hpp: header file for parfor

* parfor.hpp: put in code for Kokkos parallel for test

* Create parscan.hpp: parscan code file

* parscan.hpp: put in code for scan

* test_parreduce.cpp: deleting parreduce header

* test_parscan.cpp: using header for parscan

* test_parfor.cpp: parfor header

* test_parfor_prob.cpp: header

* test_parreduce_prob.cpp: putting in parreduce hpp

* test_parscan_prob.cpp: fix includes

* test_parreduce_prob.cpp: HasSubStr to HasSubstr

* new MatchersProb based on random seed two from CI build test OpenMP

* matchersProb.hpp: update

* test_par*_prob.cpp: apply clang format

* test_parfor.cpp: apply clang format

* matchersProb.hpp: fix to matchers - apply clang format

* matchersProb.hpp: fix to matchers semicolon apply clang format

* matchersProb.hpp: fix to matchers semicolon apply clang format

---------

Co-authored-by: Christian Trott <[email protected]>
Co-authored-by: Damien L-G <[email protected]>
Co-authored-by: Daniel Arndt <[email protected]>
Co-authored-by: Damien L-G <[email protected]>
@vlkale vlkale deleted the test-for-sampler branch June 6, 2024 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants