It should be possible to speed up compilation of Urho3D by better optimizing which headers are pre-compiled. Presently, only a few of the Container includes are (HashMap, HashSet, Sort, and Str). However, we can use clang's -ftime-trace flag to have it output a trace of the compilation process, which includes the time spent including other files.
This article details the overall process fairly well, including how to use the ninja build system to generate a trace over the entire build. It also provides some python code to combine the individual time-trace files into one, which I have slightly edited and copied below (I forced all compilations to use the same process/thread ids so they are all grouped together, and resolved all relative includes to the absolute paths).
Below is a screenshot of the results for one build (not including ThirdParty/Tools/etc., just the Urho3D library itself). The main takeaways I have from this are that we may want to pre-compile:
MathDefs.h: including <cmath> is apparently slow
Variant.h: used very frequently, though almost half its time is from the eventual MathDefs.h include
StringUtils.h (and/or fmt instead, as that is the slow part).
Object.h: Slowed by <functional>
Maybe also Quaternion.h, which is slowed by the <emmintrin.h> include, though not as much as MathDefs.h
Optimistically, these changes might shave several minutes off a single-threaded build, so I think it is worth further investigation.
Code to combine different time traces, if you want to try it yourself. Simply add -ftime-trace to CMAKE_CXX_FLAGS for a clang build, rebuild everything, and then run the script from u3d/clang-build/Source/Urho3D/CMakeFiles/Urho3D.dir/ with all the traces, e.g. merge-clang-ftime-trace.py `find -name '*.cpp.json'`
#!/usr/bin/env python3
"""Combine JSON from multiple -ftime-traces into one.

Run with (e.g.): python combine_traces.py foo.json bar.json.
"""

import json
import sys
from os.path import abspath

# All events are assigned this process/thread id; set to None to keep the originals.
FORCE_ID = 1337

if __name__ == '__main__':
    start_time = 0
    combined_data = []
    for filename in sys.argv[1:]:
        with open(filename, 'r') as f:
            file_time = None
            for event in json.load(f)['traceEvents']:
                # Skip metadata events
                # Skip total events
                # Filter out shorter events to reduce data size
                # (events without a 'dur' field are treated as zero-length and skipped)
                if (event['ph'] == 'M'
                        or event['name'].startswith('Total')
                        or event.get('dur', 0) < 1000):  # was 5000
                    continue

                if event['name'] == 'ExecuteCompiler':
                    # Find how long this compilation takes
                    file_time = event['dur']
                    # Set the file name in ExecuteCompiler
                    if 'args' not in event:
                        event['args'] = {}
                    event['args']['detail'] = filename

                # Merge all compiler calls into one process/thread so they are all shown together.
                if FORCE_ID is not None:
                    event['pid'] = FORCE_ID
                    event['tid'] = FORCE_ID

                # Resolve relative paths so they are counted as the same from different base files.
                if event['name'] == 'Source' and 'args' in event and 'detail' in event['args']:
                    event['args']['detail'] = abspath(event['args']['detail'])

                # Offset start time to make compiles sequential
                # (instead could try only using a sequential build)
                event['ts'] += start_time

                # Add data to combined
                combined_data.append(event)

            # Increase the start time for the next file
            # Add 1 to avoid issues with simultaneous events
            # (skipped if no ExecuteCompiler event survived the filter)
            if file_time is not None:
                start_time += file_time + 1

    with open('combined.json', 'w') as f:
        json.dump({'traceEvents': sorted(combined_data, key=lambda k: k['ts'])}, f)
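The `find` invocation above relies on a Unix-like shell. As a rough alternative for other environments, here is a minimal sketch that collects the same `*.cpp.json` files with Python's `pathlib`. The helper name `find_trace_files` is hypothetical and not part of the script above.

```python
from pathlib import Path


def find_trace_files(build_dir):
    """Return the sorted paths of all '*.cpp.json' files below build_dir.

    Hypothetical helper: it mirrors `find -name '*.cpp.json'` and assumes
    the traces sit next to the object files, as described in the post.
    """
    return sorted(str(path) for path in Path(build_dir).rglob('*.cpp.json'))
```

The returned list could be passed to the merge script in place of `sys.argv[1:]`, for example by calling `find_trace_files('.')` from the `Urho3D.dir` directory.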
You can then view the traces in https://www.speedscope.app/ or in Chrome's about://tracing page.
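Besides the graphical viewers, the combined trace can be summarized as text to see which headers dominate. The sketch below is only an approximation under a few assumptions: the function names `rank_includes` and `load_events` are made up for illustration, header includes are assumed to appear as complete `Source` events with a `dur` field (in microseconds) and the path in `args.detail`, and durations are inclusive, so time spent in a nested header is also counted in its parent.

```python
import json
from collections import defaultdict


def rank_includes(events, top=10):
    """Return up to `top` (path, total_us, count) tuples, slowest first.

    Only 'Source' events that carry both 'dur' and args.detail are counted.
    Durations are inclusive, so nested includes are counted in their parents too.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for event in events:
        if event.get('name') != 'Source' or 'dur' not in event:
            continue
        path = event.get('args', {}).get('detail')
        if path is None:
            continue
        totals[path] += event['dur']
        counts[path] += 1
    ranked = sorted(totals, key=totals.get, reverse=True)
    return [(path, totals[path], counts[path]) for path in ranked[:top]]


def load_events(trace_path='combined.json'):
    """Load the 'traceEvents' list from a trace file such as combined.json."""
    with open(trace_path) as f:
        return json.load(f)['traceEvents']


# Example use, assuming combined.json exists in the current directory:
#   for path, total_us, count in rank_includes(load_events()):
#       print(f'{total_us / 1e6:8.2f}s {count:5d}x {path}')
```

Because the merge script drops events shorter than its threshold, cheap but very frequent includes will be undercounted in this ranking.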