Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for offline callstack symbol resolving #665

Merged

Conversation

tiago-rodrigues
Copy link

@tiago-rodrigues tiago-rodrigues commented Nov 17, 2023

Add support for offline symbol resolving by setting the "TRACY_SYMBOL_OFFLINE_RESOLVE=1" env var.
Add symbol resolving functionality to the "update" tool, with
offline symbol resolvers for linux (using addr2line) and windows (using dbghelper).

There are a few reasons why resolving symbols offline might be useful:

  • runtime symbol resolving can be slow and use quite a bit memory - specially on windows (which is counter productive when we are doing tracy captures to investigate memory usage and we have a lot of backtrace traffic as we instrument every memory allocation with a an interposer and inject backtraces for each alloc/free).

  • libbacktrace does not work with dlopenned shared libraries after the first symbol resolve has happened, this can be quite problematic on applications that rely heavily on dynamically loaded libs like Omniverse (we can have ~200 shared libs loaded after starting our app due to the way the extension system works. Still I'll investigate adding a potential solution for this in another PR where we refresh libbacktrace internal state.

  • doing offline symbol resolving can allow a bit more flexibility on retargeting the resolving to a different set shared libraries (if symbols have been stripped for example) etc.

@wolfpld
Copy link
Owner

wolfpld commented Nov 17, 2023

I haven't added documentation about this in the PR already as I wanted to float the idea of doing the offline symbol resolving first and see if there were any major objections to it first.

Looks like a very nice addition, judging by the description. Go ahead and write the documentation. I'm not sure about the name "tracy-edit" though, but nothing comes to mind that has a nice ring to it.

I'll do a proper review later.

@YaLTeR
Copy link
Contributor

YaLTeR commented Nov 18, 2023

Oh, this would be quite useful for GNOME Shell profiling. I currently have to disable symbol resolution there because GJS seems to very heavily contend on dlsym() with Tracy's symbol workers, causing slowdowns. And callstacks are very useful.

manual/tracy.tex Outdated
the minimal set of info required for offline resolution (a shared library path and an offset into that shared library).

The generated tracy capture will have callstack frames symbols showing \texttt{[unresolved]}.
The \texttt{tracy-edit} tool can be used to load that capture, perforce symbol resolution offline
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perforce -> perform

@@ -93,6 +93,14 @@ extern "C" const char* ___tracy_demangle( const char* mangled )
namespace tracy
{

// when "TRACY_SYMBOL_OFFLINE_RESOLVE" is set to "1", instead of fully resolving symbols at runtime,
// simply resolve the offset and image name (which will be enough the resolving to be done offline)
bool getDoOfflineSymbolResolve()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be a good idea to add a #ifdef TRACY_SYMBOL_OFFLINE_RESOLVE to enable hardcode enabling this at build time, like you can do with some other options. Also, rename this function to ShouldResolveSymbolsOffline.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, I'll try to add support for the compile time ifdef in another commit later today/tomorrow as it will require a bit of #ifdefing in that file to make sure the libbacktrace/dbghelper is ifdefed out

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the ifdef as an option to the user, but also keep the current env variable based solution.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see you already committed the changes. I'm not sure it's worth having ifdefs scattered all over the code. Sure, it's dead code, but it also increases the mental effort required to understand what's going on.


DWORD64 dbgHelpLoadSymbolsForModule(PCSTR imageName, DWORD64 baseOfDll, DWORD bllSize)
{
if( !s_doOfflineSymbolResolve )
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if( s_doOfflineSymbolResolve ) return 0;

if( s_doOfflineSymbolResolve )
{
return "[unresolved]";
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write as a one-liner.


void InitCallstackCritical()
{
___tracy_RtlWalkFrameChain = (___tracy_t_RtlWalkFrameChain)GetProcAddress( GetModuleHandleA( "ntdll.dll" ), "RtlWalkFrameChain" );
}

void InitCallstack()
void dbgHelpInit()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep function names capitalized.

#endif
}

DWORD64 dbgHelpLoadSymbolsForModule(PCSTR imageName, DWORD64 baseOfDll, DWORD bllSize)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow the convention wrt to how spaces are used, etc.

// when doing offline symbol resolution, we must store the full path of the dll for the resolving to work
if( s_doOfflineSymbolResolve )
{
cachedModule->name = (char*)tracy_malloc_fast(imageNameLength + 2);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there need for two additional characters?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, it was a copy paste from the code below - fixed

@@ -2411,6 +2426,19 @@ const char* Worker::GetString( const StringIdx& idx ) const
return m_data.stringData[idx.Idx()];
}

uint32_t Worker::AddNewString(const char* newString)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a need for a new function to add strings?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need really, the only thing diff was this one was public and had that extra assert.
I've removed it and made StoreString public instead.

if (resolver)
{
PatchSymbols(resolver, worker, pathSubstitutionList, args.verbose);

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove newline.

return false;
}

#endif // #ifdef _WIN32
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add EOF newlines.

const FrameEntryList& inputEntryList,
SymbolEntryList& resolvedEntries)
{
if( resolver )
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if( !resolver ) return false;

void DestroySymbolResolver(SymbolResolver* resolver)
{
delete resolver;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is new and delete relegated to functions here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code was a bit different before when I had more state in SymbolResolver which is a private type in the CPP. There is no need for this right, removed it.

@wolfpld
Copy link
Owner

wolfpld commented Nov 19, 2023

The conflicting gcc.yml has been split into linux.yml and macos.yml, you should be able to just copy and paste your change to resolve the conflict. See 906f73c.

@tiago-rodrigues tiago-rodrigues force-pushed the trodrigues/offline_symbol_resolve branch from 8b8815b to e04e595 Compare November 19, 2023 14:37
…e linux github workflow (but not locally, likely due to diff compiler)
@tiago-rodrigues
Copy link
Author

Oh, this would be quite useful for GNOME Shell profiling. I currently have to disable symbol resolution there because GJS seems to very heavily contend on dlsym() with Tracy's symbol workers, causing slowdowns. And callstacks are very useful.

@YaLTeR: although it won't call into libbacktrace anymore to resolve symbols, not sure if it will still have some contention as it still uses dladdr() (to get the .so path and base address)? If so, maybe maintaining our own .so range mapping and updating with dl_iterate_phdr when we we get a request that doesn't land in any known ranges, would be a way to avoid it.

@YaLTeR
Copy link
Contributor

YaLTeR commented Nov 19, 2023

Hmm, if it still uses dladdr() then I suspect it will still contend :/

@tiago-rodrigues tiago-rodrigues force-pushed the trodrigues/offline_symbol_resolve branch from 1ebf8fe to 6c0bb0a Compare November 20, 2023 20:53
@tiago-rodrigues
Copy link
Author

tiago-rodrigues commented Nov 20, 2023

Hmm, if it still uses dladdr() then I suspect it will still contend :/

It's probably best to tackle this in another PR? However, I added a possible solution here tiago-rodrigues@08db7d3, adding an image range cache to avoid calling dladdr() when doing offline symbol resolution (and dl_iterate_phdr is supported). It should only refresh when an address is encountered that's outside of the known ranges. It is also used for the runtime symbol resolution case to re-create libbacktrace state and allow symbol resolution for dynamically loaded shared libs (at the cost of mem leaking when it refreshes as I can't see any libbacktrace state freeing/recreation.)

CMakeLists.txt Outdated
@@ -85,6 +85,7 @@ set_option(TRACY_FIBERS "Enable fibers support" OFF)
set_option(TRACY_NO_CRASH_HANDLER "Disable crash handling" OFF)
set_option(TRACY_TIMER_FALLBACK "Use lower resolution timers" OFF)
set_option(TRACE_CLIENT_LIBUNWIND_BACKTRACE "Use libunwind backtracing where supported" OFF)
set_option(TRACY_SYMBOL_OFFLINE_RESOLVE "Instead of runtime symbol resolving, sore offset and sharedlib path for offline symbol resolution" OFF)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "sore" the right word here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry it was a typo, fixed

@@ -401,16 +468,29 @@ CallstackSymbolData DecodeSymbolAddress( uint64_t ptr )

CallstackEntryData DecodeCallstackPtr( uint64_t ptr )
{
InitRpmalloc();

const ModuleNameAndBaseAddress moduleNameAndAddress = GetModuleNameAndPrepareSymbols( ptr );
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is outside dbghelp lock.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the locking up to include it.
Maybe it would be better to have a scoped wrapper for this lock though.

@@ -938,20 +1039,42 @@ void SymInfoError( void* /*data*/, const char* /*msg*/, int /*errnum*/ )
cb_data[cb_num-1].symAddr = 0;
}

void getSymbolForOfflineResolve(void* address, Dl_info& dlinfo, CallstackEntry& cbEntry)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize function names.

symloc = dlinfo.dli_fname;
}

if(cb_bts && !s_shouldResolveSymbolsOffline)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a case where both these checks are necessary?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no there shouldn't be, fixed it

{ "compression", required_argument, NULL, 'c' },
{ "compressesionLevel", required_argument, NULL, 'l' },
{ NULL, 0, NULL, 0 }
};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you unify processing of compression parameters with the update utility?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I have to admit I didn't realize the update tool already did that. Maybe I should actually move the symbol resolution to it instead of creating the tracy-edit tool then?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may make sense to do that. Try to keep the new functionality in a separate source file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

uint64_t callstackFrameCount = worker.GetCallstackFrameCount();
std::string relativeSoNameMatch = "[unresolved]";

std::cout << "Found '" << callstackFrameCount << "' callstack frames. Batching into image groups..." << std::endl;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the count supposed to be quoted here?

Copy link
Author

@tiago-rodrigues tiago-rodrigues Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed all the numeric quoting in there

@wolfpld
Copy link
Owner

wolfpld commented Nov 25, 2023

I did a quick test and the resolved symbols don't show up in the symbol statistics list. Is this something you will be working on? If not, I think the documentation should state that there are functional differences.

@tiago-rodrigues
Copy link
Author

I did a quick test and the resolved symbols don't show up in the symbol statistics list. Is this something you will be working on? If not, I think the documentation should state that there are functional differences.

Right, I forgot to tackle that one. I was mainly using the memory view for the workflow I'm working right now.
I've updated the documentation to reflect this limitation.

I'll try to do another PR after where I'll attempt to fixup the symbol map for the newly resolved symbols. However it might be a bit more involved if I ended up avoiding calling dladdr (which I'll have another PR as well in order to reduce the contention mentioned by YalTeR : #665 (comment)).

@wolfpld wolfpld merged commit af73dba into wolfpld:master Nov 27, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants