
macho: fix DWARF in dSYM and make sym naming more consistent #8784


Merged: 1 commit, May 16, 2021


@kubkon (Member) commented May 15, 2021

  • Advance line and PC prior to ending the sequence in the debug line program for a fn_decl. This is equivalent to closing the scope in the debugger; without it, the debugger will not map source-to-address info and, as a result, will not print the source when breaking at a symbol.
  • Fix debug aranges sentinels to be the same size as the actual tuple descriptor (assuming the segment selector is omitted). In summary, the sentinels were 32-bit 0s, whereas they ought to be 64-bit 0s.
  • Make naming of symbols in the binary more consistent by prefixing each symbol name with an underscore '_'.
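For readers unfamiliar with the DWARF encodings involved, the first two fixes can be sketched at the byte level. This is a minimal illustration in Python, not the code from this PR: the helper names (`close_fn_sequence`, `aranges_terminator`) are hypothetical, the opcode values come from the DWARF standard, and `minimum_instruction_length` is assumed to be 1.

```python
# Hypothetical byte-level sketch of the two DWARF fixes; not the PR's Zig code.

# Opcode values as defined by the DWARF line number program encoding.
DW_LNS_advance_pc = 0x02    # standard opcode, one ULEB128 operand
DW_LNS_advance_line = 0x03  # standard opcode, one SLEB128 operand
DW_LNE_end_sequence = 0x01  # extended opcode, no operands


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's >> is an arithmetic shift for negative ints
        sign_bit = bool(byte & 0x40)
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def close_fn_sequence(pc_delta: int, line_delta: int) -> bytes:
    """Advance PC and line to the end of the function, then end the sequence.

    pc_delta is in units of minimum_instruction_length (assumed to be 1).
    """
    return (
        bytes([DW_LNS_advance_pc]) + uleb128(pc_delta)
        + bytes([DW_LNS_advance_line]) + sleb128(line_delta)
        # Extended opcodes are: 0x00, ULEB128 length, then the opcode itself.
        + bytes([0x00]) + uleb128(1) + bytes([DW_LNE_end_sequence])
    )


def aranges_terminator(address_size: int = 8) -> bytes:
    """Terminating (address, length) tuple: two zeros, each address_size wide.

    With no segment selector, a 64-bit target needs 16 zero bytes here.
    """
    return bytes(2 * address_size)
```

For example, `close_fn_sequence(0x20, 3)` yields the bytes `02 20 03 03 00 01 01`, and `aranges_terminator(8)` yields 16 zero bytes, where a terminator built from 32-bit zeros would only be 8 bytes long.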

With these changes, here's what we get in lldb when debugging a Zig program compiled with the self-hosted compiler:

$ zig-out/bin/zig build-exe hello.zig
$ ls
hello     hello.zig zig-cache
$ lldb hello
(lldb) target create "hello"
Current executable set to '/Users/kubkon/dev/zig/examples/stage2/hello' (arm64).
(lldb) b main
Breakpoint 1: where = hello`main at hello.zig:4, address = 0x0000000100001000
(lldb) r
Process 54300 launched: '/Users/kubkon/dev/zig/examples/stage2/hello' (arm64)
Process 54300 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100001000 hello`main at hello.zig:4
   1   	extern "c" fn write(usize, usize, usize) usize;
   2   	extern "c" fn exit(usize) noreturn;
   3
-> 4   	export fn main() noreturn {
   5   	    print();
   6
   7   	    exit(0);
Target 0: (hello) stopped.
(lldb)

Note that we successfully print the source file info even though the binary itself is devoid of any old-fashioned stab symbols and DWARF info. Instead, we make use of the modern alternative on macOS called the dSYM bundle, which is a mirror image of the binary's in-memory layout with relocated DWARF info attached. The beautiful thing about the dSYM bundle is that it doesn't have to reside alongside the binary, since it is the matching UUID between the binary and the dSYM that identifies the pairing, and lldb is smart enough to ask the Spotlight service to locate the dSYM from the global macOS cache. The tl;dr here, though, is that Zig's self-hosted compiler is supercharged: we not only provide an incremental (and a traditional!) MachO linker, but now also implement something akin to a built-in dsymutil for managing debugging symbols.
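The UUID pairing described above can be illustrated with a short Python sketch. It assumes thin, little-endian, 64-bit Mach-O images and reads the `LC_UUID` load command from each; the function names (`macho_uuid`, `same_build`, `build_stub_image`) are hypothetical and are not part of Zig or lldb, and the stub image exists only to keep the example self-contained.

```python
# Hypothetical sketch: compare the LC_UUID of a binary and its dSYM companion.
import struct
import uuid

MH_MAGIC_64 = 0xFEEDFACF  # magic of a thin little-endian 64-bit Mach-O
LC_UUID = 0x1B            # load command carrying the 16-byte image UUID
HEADER_SIZE = 32          # sizeof(struct mach_header_64)


def macho_uuid(image: bytes):
    """Return the UUID stored in LC_UUID, or None if the command is absent."""
    (magic, _cputype, _cpusubtype, _filetype,
     ncmds, _sizeofcmds, _flags, _reserved) = struct.unpack_from("<8I", image, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")
    offset = HEADER_SIZE
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", image, offset)
        if cmdsize < 8:
            raise ValueError("malformed load command")
        if cmd == LC_UUID:
            # uuid_command layout: cmd, cmdsize, then 16 raw UUID bytes.
            return uuid.UUID(bytes=image[offset + 8:offset + 24])
        offset += cmdsize
    return None


def same_build(binary: bytes, dsym: bytes) -> bool:
    """True when both images carry the same LC_UUID."""
    found = macho_uuid(binary)
    return found is not None and found == macho_uuid(dsym)


def build_stub_image(image_uuid: uuid.UUID) -> bytes:
    """Build a header plus a single LC_UUID command, for demonstration only."""
    header = struct.pack("<8I", MH_MAGIC_64, 0x0100000C, 0, 2, 1, 24, 0, 0)
    return header + struct.pack("<II", LC_UUID, 24) + image_uuid.bytes
```

On real files, `image` would be the bytes of the executable and of the DWARF companion file inside the dSYM bundle; universal (fat) binaries would need an extra step to select the right slice first.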

@kubkon added the os-macos, frontend (Tokenization, parsing, AstGen, Sema, and Liveness), and backend-self-hosted labels May 15, 2021
@daurnimator (Contributor)

> The beautiful thing about the dSYM bundle is that it doesn't have to reside alongside the binary, since it is the matching UUID between the binary and the dSYM that identifies the pairing, and lldb is smart enough to ask the Spotlight service to locate the dSYM from the global macOS cache

FYI to support that on other platforms we need build-id support (#3047)

@kubkon (Member, Author) commented May 15, 2021

> > The beautiful thing about the dSYM bundle is that it doesn't have to reside alongside the binary, since it is the matching UUID between the binary and the dSYM that identifies the pairing, and lldb is smart enough to ask the Spotlight service to locate the dSYM from the global macOS cache
>
> FYI to support that on other platforms we need build-id support (#3047)

I've heard somewhere that dSYM will make it into the DWARF standard at some point, which is all the more reason to have build-id support on all platforms.

@kubkon kubkon merged commit 6461b95 into master May 16, 2021
@kubkon kubkon deleted the macho-dsym branch May 16, 2021 06:12