Stream: t-compiler/wg-llvm

Topic: DWARF .debug_line column and utf-8


bjorn3 (Dec 16 2019 at 20:23, on Zulip):

While trying to optimize the .debug_line generation of cg_clif, I noticed that cg_llvm uses the CharPos as the column number in the .debug_line column field. I tried to find out whether this is the correct value or whether a BytePos should be used instead, but the DWARF specification doesn't seem to say which one is correct.

bjorn3 (Dec 16 2019 at 20:23, on Zulip):

If it is the BytePos, that could speed up .debug_line generation in cg_clif and maybe cg_llvm by a few percent, as there would be no need for the conversion anymore.
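
To illustrate the difference being discussed, here is a minimal sketch in plain Rust. It does not use rustc's actual BytePos/CharPos types; `byte_column` and `char_column` are hypothetical helpers, and columns are 0-based offsets from the start of the line. The byte column is a single subtraction, while the character column has to decode the UTF-8 between the line start and the position.

```rust
/// Hypothetical helper: 0-based byte offset of `pos` from the start of its line.
/// `pos` is a byte index into `src` and must lie on a char boundary.
fn byte_column(src: &str, pos: usize) -> usize {
    let line_start = src[..pos].rfind('\n').map_or(0, |i| i + 1);
    pos - line_start
}

/// Hypothetical helper: 0-based character offset of `pos` from the start of its line.
/// This walks the UTF-8 text of the line prefix, which is the extra conversion work.
fn char_column(src: &str, pos: usize) -> usize {
    let line_start = src[..pos].rfind('\n').map_or(0, |i| i + 1);
    src[line_start..pos].chars().count()
}

fn main() {
    // 'é' is two bytes in UTF-8, so the two columns differ after it.
    let src = "let a = 1;\nlet é = 2;\n";
    let pos = src.rfind('=').unwrap();
    println!(
        "byte column: {}, char column: {}",
        byte_column(src, pos),
        char_column(src, pos)
    ); // prints "byte column: 7, char column: 6"
}
```

For pure-ASCII lines both helpers return the same value, so the choice only matters for lines containing multi-byte characters.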

cuviper (Dec 16 2019 at 20:44, on Zulip):

good question!

cuviper (Dec 16 2019 at 20:45, on Zulip):

in the absence of an answer in the standard, which I don't see either, I'd look at what other producers like clang do, or how consumers like gdb apply it

bjorn3 (Dec 16 2019 at 21:00, on Zulip):

Clang

CGDebugInfo::getColumnNumber calls SourceManager::getPresumedLoc to get the PresumedLoc which contains the column number. SourceManager::getPresumedLoc calls SourceManager::getColumnNumber which seems to use a byte position and not a character position.

bjorn3 (Dec 16 2019 at 21:25, on Zulip):

GCC

It has a 21725-line parser file for C.

add_debug_begin_stmt is often called with a location_t from c_parser_peek_token (parser)->location. That location comes from c_lex_one_token, which takes it from c_lex_with_flags, which I think takes it from cpp_get_token_with_location. That function tail-calls cpp_get_token_1, which calls _cpp_lex_token, which calls _cpp_lex_direct, which finally uses the CPP_BUF_COLUMN macro. The macro is a simple subtraction of pointers, so GCC uses a byte position too.

bjorn3 (Dec 16 2019 at 21:25, on Zulip):

I will file an issue to change cg_llvm to use byte positions too.

bjorn3 (Dec 16 2019 at 21:29, on Zulip):

https://github.com/rust-lang/rust/issues/67360

Last update: Jan 28 2020 at 01:45UTC