Stream: t-compiler/wg-incr-comp

Topic: #34651 split dwarf progress updates


view this post on Zulip davidtwco (Sep 22 2020 at 16:42):

So, a brief update as to my progress after this morning's meeting:

The code needs some cleaning up, but so far the changes aren't all that invasive. The example below contains a per-codegen-unit dwo file path. I don't know whether that's going to be the case in the end; those were just the strings that I had to hand.

 <0><9b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <9c>   DW_AT_stmt_list   : 0xe8
    <a0>   DW_AT_comp_dir    : (indirect string, offset: 0x13b): /home/david/Projects/rust/rust0
    <a4>   DW_AT_GNU_dwo_name: (indirect string, offset: 0x15b): foo.foo.7rcbfp3g-cgu.0.rcgu.dwo
    <a8>   DW_AT_GNU_dwo_id  : 0x357472a2b032d7b9
    <b0>   DW_AT_low_pc      : 0x0
    <b8>   DW_AT_ranges      : 0x40
    <bc>   DW_AT_GNU_addr_base: 0x0

Once I've got it emitting the dwo files correctly, I'll start looking into platform checks and things like that.
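
For reference, a minimal sketch of pulling the two dwo attributes out of a dump like the one above. It assumes the readelf-style line layout shown (the value follows the last ": " on the line); `DwoRef` and `parse_dwo_ref` are hypothetical names for illustration, not anything in rustc.

```rust
/// The two skeleton-unit attributes that point at a .dwo file.
/// (Hypothetical helper type, for illustration only.)
#[derive(Debug, Default, PartialEq)]
struct DwoRef {
    name: Option<String>,
    id: Option<u64>,
}

/// Scan readelf-style dump lines for DW_AT_GNU_dwo_name and DW_AT_GNU_dwo_id.
/// Assumes the attribute value is whatever follows the last ": " on the line.
fn parse_dwo_ref(dump: &str) -> DwoRef {
    let mut out = DwoRef::default();
    for line in dump.lines() {
        let value = match line.rsplit_once(": ") {
            Some((_, v)) => v.trim(),
            None => continue,
        };
        if line.contains("DW_AT_GNU_dwo_name") {
            out.name = Some(value.to_string());
        } else if line.contains("DW_AT_GNU_dwo_id") {
            // The id is printed as a hex literal, e.g. 0x357472a2b032d7b9.
            let hex = value.strip_prefix("0x").unwrap_or(value);
            out.id = u64::from_str_radix(hex, 16).ok();
        }
    }
    out
}
```

Fed the dump above, this should yield the `foo.foo.7rcbfp3g-cgu.0.rcgu.dwo` name and the 64-bit id, which is the pair a debugger needs to locate and validate the dwo file.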

view this post on Zulip davidtwco (Sep 23 2020 at 14:30):

Another update:
I now have Split DWARF working. It still needs some tidying up, but compiling with -Zsplit-dwarf=split will output a .dwo file per codegen unit, each of which is referenced by the DWARF in the binary. If I run gdb, it appears to load them, and if I move them, gdb complains.
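
The gdb behaviour is consistent with the debugger resolving a relative dwo name against the compilation directory. A minimal sketch of that lookup, assuming that resolution rule; `resolve_dwo_path` is a hypothetical name, not a gdb or rustc function.

```rust
use std::path::{Path, PathBuf};

/// Resolve a dwo name against the unit's compilation directory.
/// Assumption: a relative DW_AT_GNU_dwo_name is interpreted relative to
/// DW_AT_comp_dir, while an absolute one is used unchanged.
fn resolve_dwo_path(comp_dir: &str, dwo_name: &str) -> PathBuf {
    // `Path::join` already returns `dwo_name` alone when it is absolute.
    Path::new(comp_dir).join(dwo_name)
}
```

If the file at the resolved path has been moved, the lookup fails, which would match gdb complaining.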

view this post on Zulip davidtwco (Sep 23 2020 at 14:32):

I want to clean up the code a little bit, make some changes to the flag I've added, and check for appropriate targets (I'm unsure what LLVM does when I pass it this on Windows, for example). I also need to figure out whether to write a single dwo file rather than one per codegen unit, or whether I can link them together somehow, and check that the single mode works.

view this post on Zulip davidtwco (Sep 23 2020 at 14:32):

Oh, and I need to figure out how to write tests for this.

view this post on Zulip bjorn3 (Sep 23 2020 at 14:34):

Should -Zsplit-dwarf=split be equivalent to -Zrun-dsymutil=no on macOS?

view this post on Zulip davidtwco (Sep 23 2020 at 14:34):

I'm not familiar with -Zrun-dsymutil=no on macOS.

view this post on Zulip bjorn3 (Sep 23 2020 at 14:37):

On macOS the linker doesn't add the debuginfo to the generated executable. Instead it adds a section that specifies which part of which object file ended up where in the executable. dsymutil is then run to take all the debuginfo for used functions and rewrite it into a .dSYM file. If -Zrun-dsymutil=no is used, no .dSYM file is generated. Instead the temporary object files are kept so that debuggers can still find the debuginfo.

view this post on Zulip davidtwco (Sep 23 2020 at 14:47):

It's similar, I think:

My understanding is that Split DWARF partitions the debuginfo sections into those that require link-time relocation and those that don't. Those that don't are typically larger. Under normal circumstances the linker still processes the debuginfo that doesn't require link-time relocation, which wastes time and memory; Split DWARF makes it so that this debuginfo is never seen by the linker. There are two ways it can do that, which clang calls split and single (kinds of "dwarf fission", a name that comes from the original project to do this in gcc land).

Split fission creates DWO (dwarf object) files containing the debuginfo that doesn't require link-time relocation and the linker doesn't look at them at all; the objects contain DW_AT_GNU_dwo_name and DW_AT_GNU_dwo_id DWARF attributes which have a path to the file (it's relative currently). Those attributes change if LLVM thinks we're doing DWARF 5 but it all works the same as far as I can tell.

Single fission still writes the debuginfo to the relocatable object but in such a way that it's ignored by the linker - I don't know more about it than that.
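
To make the partitioning concrete, here is a rough sketch of the split. It assumes the usual convention that sections destined for the DWO file carry a `.dwo` suffix in their name; `partition_sections` is a hypothetical helper and the section names in the usage below are illustrative, not taken from the PR.

```rust
/// Partition section names into (kept in the relocatable object, moved to the .dwo file).
/// Assumption: sections that need no link-time relocation are named with a
/// trailing ".dwo", e.g. ".debug_info.dwo".
fn partition_sections<'a>(sections: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    sections
        .iter()
        .copied()
        // `partition` puts items for which the closure is true on the left.
        .partition(|name| !name.ends_with(".dwo"))
}
```

For example, given `.debug_info`, `.debug_addr`, `.debug_info.dwo` and `.debug_str.dwo`, the first two would stay in the object for the linker to process and the last two would go to the DWO file.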

So, compared with -Zrun-dsymutil: -Zsplit-dwarf=split will put debuginfo in a separate file, but whether that's one file or many depends on how I implement this. Currently it outputs a dwo file per codegen unit. I suspect that'll change and I'll just output a foo.dwo alongside the foo binary, and using save-temps might keep the original per-codegen-unit files, but I don't know exactly yet, as I haven't looked into how to do any of that part.

view this post on Zulip davidtwco (Sep 23 2020 at 14:48):

Does that make sense?

view this post on Zulip davidtwco (Sep 23 2020 at 17:08):

Opened draft PR at #77117 with what I've got so far.

view this post on Zulip davidtwco (Oct 14 2020 at 18:24):

Updated the PR today to resolve the linking issue that I described at our last meeting. It turns out that LLVM has a tool for doing what I needed, which I wasn't aware of.

view this post on Zulip davidtwco (Jan 14 2021 at 21:13):

cc #t-compiler > split dwarf and dependencies


Last updated: Oct 21 2021 at 21:20 UTC