Hi, thanks for the great lab.
I know that the data packing lab is marked as broken, and I also can't reproduce the roughly 20% speedup mentioned in the video. However, I do get a speedup of about 3-8% when using clang 17 on Windows. Maybe we can investigate the current state of this lab further?
Hi @jsjtxietian, sure, if you're interested, feel free to investigate. I'm currently very busy, so I won't be able to look into this in the next 1-2 months.
The following data was collected with N = 50000 and an iteration count of 10000, on Windows 11, using VTune with clang 17.0.6.
(Note: I could not get a reliable optimization effect with the original value of N.)
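For context on why N matters: data packing shrinks each element, so the array's total footprint (N times the element size) is what changes relative to the cache sizes. Below is a minimal sketch of that arithmetic. The struct layouts and the names `Unpacked`, `Reordered`, and `working_set_bytes` are hypothetical, chosen for illustration only; they are not the lab's actual code or solution.

```cpp
#include <cstddef>

// Hypothetical element type with fields in an order that forces padding.
// Assumption: this is NOT the lab's real struct, only an illustration.
struct Unpacked {
    int i;
    long long l;
    short s;
    double d;
    bool b;
};

// Same fields, ordered from largest to smallest alignment so that the
// compiler needs less padding between members.
struct Reordered {
    double d;
    long long l;
    int i;
    short s;
    bool b;
};

// Total bytes occupied by an array of n elements of type T.
template <typename T>
constexpr std::size_t working_set_bytes(std::size_t n) {
    return n * sizeof(T);
}
```

On common 64-bit ABIs `sizeof(Unpacked)` is typically 40 and `sizeof(Reordered)` typically 24, but both values are implementation-defined, so it is worth printing them on the machine under test. Comparing `working_set_bytes<Unpacked>(50000)` against `working_set_bytes<Reordered>(50000)` shows how much the packed array shrinks for the N used above.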
Hotspot analysis shows that the time savings come mainly from std::shuffle:
Microarchitecture Exploration shows a slight decrease in Backend Bound:
Some things I observed when comparing hardware events: