-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve opt_merge
performance
#4175
Conversation
Cool. I'll run our benchmarks on it when I get a chance. |
@rmlarsen Interesting. There seem to be much more time spent in |
@povik yeah, that is odd. Is the |
@povik yeah
You end up combining a lot of those potentially. So the hashing is definitely weaker. |
I wonder if pulling in xxhash or cityhash would be an improvement here. |
BTW: This code is really weird. Line 39 in 1b73b5b
|
passes/opt/opt_merge.cc
Outdated
hash_conn_strings.push_back(s + "\n"); | ||
conn_hash = mkhash(a.hash(), acc); | ||
} else { | ||
for (auto conn : cell->connections()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite grok what's going on here, but it seems quite different from the original hashing of connections. Could this be the source of more collisions?
Right, that doesn't make much sense. |
Turns out the collisions are due to a combination of |
Wait, what? ILP64 may not be popular but since when does the C++ standard ban it? |
Second closer look at https://en.cppreference.com/w/cpp/language/types indeed confirms the standard allows int be 64-bit, the page just doesn't list it as an option in the "common data models" table. I will revert the |
On second thought those ILP64 being so rare there's no point keeping a special branch for them, if the default branch is basically fine too. I think I will keep the change in. |
It's currently the only way to get around the limitation on design size, which people do run into somewhat regularly. Lines 203 to 204 in d00843d
I don't know if anyone has taken this path in practice, but from past conversations in issues I suspect some people patched yosys locally to replace "int" with "long". We might want to go that way long-term too... we've stayed away from it so far out of fear of introducing more subtle issues that our tests won't be able to uncover, but given how we keep finding that the hash wasn't working properly in the first place maybe we should just go for it. Either way we should probably take that discussion to a separate issue/PR. |
@nakengelhardt @povik if this is indeed something people struggle with in the Yosys, I'd highly recommend going with the standard types int32_t and int64_t etc. We don't have to suffer these indignities anymore. |
How does that help, what benefit do you get from requiring a precise size rather than a minimum size here? |
Well defined semantics of your code, perhaps? |
But it doesn't seem like the lack of defined semantics is the issue here, or something anyone is struggling with. It seems like the underspecification of |
Sure. But in this instance, the code is less readable and has different behavior on different platforms, which should be avoided unless absolutely necessary IMHO. |
Yes, we do need configurability of the hash size for different use cases. The current way of doing it has just fallen out of favor since (or possibly before) this code was first written. Ideally we would have 64b hash be the default on 64b architectures, but we need to maintain the option of 32b hashes for smaller/slower architecture targets where the performance hit matters (people running yosys on an older raspi are still a thing, and then there's that proof-of-concept someone made of a softcore doing synthesis to partially reconfigure the FPGA it is running on, which we want to enable just for the heck of it). So there's no way around it that any caller of the function has to work for all sizes anyway. |
|
I see. [This is C++, so this is usually done with explicit overloads, not by adding a runtime check for |
Noooooooo! ;-) |
What's the purpose of this custom hash function in the first place? It seems like adopting an existing function that works on char* streams would both be platform independent, and faster than what we've done here. |
Moreover, in the code in question, it is also being used to combine hashes for which this hash function is very poor. There should probably be a separate hash function for that purpose. IFAICT, DJB2 is a very simple hashing for strings. |
kernel/rtlil.h
Outdated
@@ -712,7 +712,7 @@ struct RTLIL::Const | |||
inline unsigned int hash() const { | |||
unsigned int h = mkhash_init; | |||
for (auto b : bits) | |||
mkhash(h, b); | |||
h = mkhash(h, b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
DJB2 didn't make the leaderboard? Notwithstanding we are probably using it wrong in the changed |
Has anyone actually benchmarked this and found that the hash function is an issue? Or is this conjecture? |
It is not a hotspot in my benchmarks, but this was measured on Intel Skylake x86_64. The main issue in this context was the poor hashing, which caused a significant number of hash collisions. But I think Martin fixed the primary source of that (lack of hashing of constants). |
I assume Catherine's question was about hashing in general all over Yosys code, and the associated
FWIW the poor properties of |
Ah yes. Hash table manipulation in general (but not FWIW, we turned on more ABC passes since my original PRs, so ignore those. But in yosys proper, the largest time sinks are hashing related. |
I believe the hot spots in my profile are related to the non-coherent layout of the hash tables in yosys, not time spent in the hash function itself. Replacing the hash table implementation with something like the swisstables from Abseil would likely give a significant speedup. I have not had the time to experiment with this, as it is a rather invasive change. https://abseil.io/about/design/swisstables |
@povik here is the flame graph for most of the Yosys cost corresponding to the "Bottom up" list above. It's a mix of OptMergePass/OptExprPass/CleanupPass/OptMuxTreePass. |
Avoid building a string which we subsequently hash when hashing cells, instead use the readily available `hash()` on IDs and `SigSpec` and combine those to build the overall cell hash.
There's little value in treating those with `opt_merge` but they can be relatively expensive to hash if their init data is large in size.
0790bfb
to
379c462
Compare
Superseded by Emil's work in #4677 making opt_merge use a new hashing interface |
See the commits. This replaces the hashing implementation, and makes it ignore cells holding memory initialization data.
Ping @rmlarsen since they were interested in
opt_merge
performance before.