-
I'm bumping this and marking it as a gem5-dev discussion, since that category might be more relevant.
-
Thanks for bumping this; I think I missed it originally. I assume this is somewhat related: #1534. My view is that gem5 is intended to be deterministic, and we shouldn't depend on Docker (etc.) to make this guarantee. (KVMCPU is an outlier: it is non-deterministic because it depends on the host hardware, and I don't see any way around that.) I am in favor of your suggested fix.
-
Looks good to me. An additional step would be to stop using I am not sure it is technically possible in C++ to forbid a user from doing that, as uint32_t and uint64_t are simple aliases for unsigned or unsigned long. However, we can start amending existing code and be vigilant on future PRs.
-
Hello all,
My student and I ran into an unexpected issue running the same gem5 code, FS image and checkpoints on two machines with different environments:
In both cases, I am running an old gem5 (4ef1f17e0f9c6f15b8ad63eff7a2d07025c29709 from 2020), but after looking at current stable and develop, I think the problem is still there. The problem is that when running a simulation in FS mode with a single core (O3CPU), restoring from a simpoint, and using the classic memory model, I get different stats (including IPC) on the two machines.
The reason, it appears, is in `src/base/random.hh`. A number of components (notably the branch predictors, some replacement policies, and the fetch stage of O3CPU) use the `random_mt` object of type `Random` to obtain random values. Under the hood, `random_mt` relies on a member of type `std::mt19937_64` called `gen` to provide numbers. That part is portable, because `std::mt19937_64` is not left to the implementation: it follows a specific algorithm with specific initial values.

However, instead of providing a random number by calling `operator()` on `gen`, we go through a `std::uniform_int_distribution` whenever we want a number whose range is smaller than the 64 bits provided by `gen()`, as the fetch stage of O3CPU does, for instance (src/cpu/o3/fetch.cc:883 on stable).

It turns out that `std::uniform_int_distribution` is implementation-dependent (https://www.ida.liu.se/~TDDD38/ISOCPP/rand.dist.general.html or point 29.6.8.1.3 of the standard). If I change the code in `src/base/random.hh` from (what is currently in stable and develop):

to:

Then I get the same result on both machines, because the RNG is deterministic and the max value is the same on gcc and clang for `uint8_t`. I'm not entirely sure how to deal with the FP version (haven't thought about it very hard), and the quality of the randomness may not be the same with this technique as with `std::uniform_int_distribution`.

So, the general question would be: is this problem supposed to be handled by using Docker, or do we want to strive for two runs of the same code and inputs to provide identical results on different machines? I think the latter is good to have, to avoid surprises for people who have heterogeneous infrastructures and don't use Docker (or similar) by default. Moreover, it's an easy fix in this case, assuming the fix provides "random enough" numbers for whatever the components use them for.
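To make the idea concrete, here is a minimal sketch of a bounded draw that relies only on `std::mt19937_64::operator()`, whose output sequence is fixed by the standard. It is not the exact change described above and not gem5's actual `Random` class: `portableUniform` and `portableUnitDouble` are hypothetical names, the rejection loop is one common way to avoid modulo bias, and the floating-point helper is only one possible answer to the FP question (it assumes IEEE 754 doubles).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper (not gem5 code): integer in [min, max], drawn using
// only the raw 64-bit output of the generator.
inline std::uint64_t
portableUniform(std::mt19937_64 &gen, std::uint64_t min, std::uint64_t max)
{
    assert(min <= max);
    const std::uint64_t all = std::numeric_limits<std::uint64_t>::max();
    const std::uint64_t span = max - min;
    if (span == all)
        return gen();  // full 64-bit range requested, nothing to reduce

    const std::uint64_t range = span + 1;  // number of distinct outcomes
    // 2^64 mod range, computed without overflowing 64 bits.
    const std::uint64_t remainder = (all % range + 1) % range;
    // Largest raw value that still belongs to a complete bucket of `range`.
    const std::uint64_t limit = all - remainder;

    std::uint64_t raw;
    do {
        raw = gen();       // redraw values from the final partial bucket
    } while (raw > limit);
    return min + raw % range;
}

// Hypothetical helper (not gem5 code): double in [0, 1) built from the top
// 53 bits of one raw draw. Both the conversion and the scaling are exact.
inline double
portableUnitDouble(std::mt19937_64 &gen)
{
    return static_cast<double>(gen() >> 11) * (1.0 / 9007199254740992.0);
}
```

Because every step is plain integer arithmetic on the raw generator output, two builds with different standard libraries should see the same sequence for the same seed. Note that the rejection loop may consume more than one raw draw per call, so the generator state advances differently than it would with a plain modulo.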
I have no idea if that would still hold in multithreaded runs (I feel like it should?), and I suspect it may not work with KVMCPU, but I know I would like to have this feature. I agree it may be painful to make sure there is no regression, because you would essentially need to set up CI with two compilers...
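As a rough idea of what such a check could look like without comparing full simulation stats, the sketch below hashes the first few draws into a single fingerprint that can be printed by each build and compared. Everything here is an assumption for illustration: `fingerprint`, `rawFingerprint` and `distFingerprint` are hypothetical names, FNV-1a is just a convenient hash, and the range `[0, 99]` is arbitrary.

```cpp
#include <cstdint>
#include <random>

// Fold one 64-bit word into a running 64-bit FNV-1a hash, byte by byte.
inline std::uint64_t
fnv1a(std::uint64_t hash, std::uint64_t word)
{
    for (int i = 0; i < 8; ++i) {
        hash ^= (word >> (8 * i)) & 0xffu;
        hash *= 0x100000001b3ULL;  // 64-bit FNV prime
    }
    return hash;
}

// Hash the first `count` values returned by `draw`.
template <typename Draw>
std::uint64_t
fingerprint(Draw draw, int count)
{
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // 64-bit FNV offset basis
    for (int i = 0; i < count; ++i)
        hash = fnv1a(hash, draw());
    return hash;
}

// Fingerprint of the raw generator: expected to match across compilers.
inline std::uint64_t
rawFingerprint(std::uint64_t seed, int count)
{
    std::mt19937_64 gen(seed);
    return fingerprint([&gen] { return gen(); }, count);
}

// Fingerprint through std::uniform_int_distribution: stable within one
// build, but not guaranteed to match between standard libraries.
inline std::uint64_t
distFingerprint(std::uint64_t seed, int count)
{
    std::mt19937_64 gen(seed);
    std::uniform_int_distribution<std::uint64_t> dist(0, 99);
    return fingerprint([&gen, &dist] { return dist(gen); }, count);
}
```

A CI job per compiler could print both fingerprints for a fixed seed; a mismatch in the second one between jobs would point at the distribution rather than the generator.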
Any thoughts?