991026.3 | B | I | J. Merrill | Compression | Duplicate Dwarf data deletion |
The scheme we sketched out yesterday will allow us to remove duplicates
from the .debug_info section, which is the most important case (modulo
.debug_abbrev handling, as discussed below). However, there can be
duplication in other DWARF 2 sections as well:
.debug_abbrev: Some entries may only be used by discarded info.
.debug_str, .debug_loc: Likewise.
.debug_line: Line info for discarded COMDATs should also be discarded.
.debug_frame: Likewise for unwind info.
.debug_pubnames, .debug_aranges: Likewise.
.debug_macinfo: Also subject to duplication.
_abbrev is tricky because the abbrevs need to be numbered. This means that
we must define a certain set of abbrevs ahead of time, and all the
.debug_info bits to be commonized can only use those abbrevs. This
significantly complicates the process of reducing .debug_info.
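The numbering problem can be illustrated with a minimal sketch. It assumes
only that a DIE begins with the ULEB128-encoded code of its abbrev; the table
layout, the helper names (uleb128, encode_die) and the sample attribute bytes
are hypothetical and not part of this proposal.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_die(abbrev_table, shape, attr_bytes):
    """Emit a DIE: the abbrev code for its shape, then the attribute values."""
    return uleb128(abbrev_table[shape]) + attr_bytes


# Hypothetical DIE shapes: a tag plus the attributes it carries.
BASE_TYPE = ("DW_TAG_base_type", ("DW_AT_name", "DW_AT_byte_size"))
POINTER = ("DW_TAG_pointer_type", ("DW_AT_type",))

# Two objects that numbered their abbrevs independently.
unit_a = {BASE_TYPE: 1, POINTER: 2}
unit_b = {POINTER: 1, BASE_TYPE: 2}

# A table agreed on ahead of time and used by every object.
shared = {BASE_TYPE: 1, POINTER: 2}

payload = b"int\x00\x04"  # name "int", byte_size 4 (illustrative values)

die_a = encode_die(unit_a, BASE_TYPE, payload)
die_b = encode_die(unit_b, BASE_TYPE, payload)
```

Here die_a and die_b describe the same type but differ in their first byte, so
a byte-wise comparison would not treat them as duplicates. Encoding both
against the shared table yields identical bytes.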
_line is tricky since the header contains a list of filenames that
will be referenced later. This also affects .debug_info, since
DW_AT_decl_file (if used) is an index into that same file list.
_pubnames and _aranges are tricky because the header records the
length of the pubname/arange set, which would have to be recalculated
at link time if entries were discarded. This is also true of .debug_info.
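As a rough sketch of the link-time calculation involved: if a set begins with
a 4-byte length that counts everything after the length field, then dropping
entries means rebuilding the set and rewriting that field. The function name,
the little-endian byte order and the sample bytes are assumptions made for
illustration only.

```python
import struct


def rebuild_set(header_rest, entries, keep):
    """Rebuild a length-prefixed set after discarding some entries.

    header_rest: header bytes that follow the 4-byte length field
    entries:     list of encoded entries (bytes)
    keep:        list of booleans, one per entry
    """
    body = header_rest + b"".join(e for e, k in zip(entries, keep) if k)
    # The length field counts every byte after itself.
    return struct.pack("<I", len(body)) + body


# Three entries, the middle one belonging to a discarded COMDAT.
result = rebuild_set(b"\x02\x00", [b"aaaa", b"bb", b"c"], [True, False, True])
new_length = struct.unpack("<I", result[:4])[0]
```

With a 2-byte header remainder and surviving entries of 4 and 1 bytes,
new_length is 7 and the rebuilt set is 11 bytes long.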
On the other hand, it would be pretty straightforward to generate an
additional CU within the object file. It could use the _abbrev and _line
info from the main CU. The minimum overhead (for a 32-bit target) would be:
    11 bytes for the CU header
     1 byte for the CU DIE TAG
     1 byte for AT_language
     4 bytes for a pointer into .debug_str for AT_producer
     4 bytes for AT_stmt_list (maybe)
    --
    21 bytes
A possible extension would be AT_extension for TAG_compile_unit, so the
secondary CU's DIE would be only 1+4 bytes (the tag plus a 4-byte reference);
with the 11-byte CU header, that brings the total to 16 bytes.
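The byte counts above can be checked with a short tally. The breakdown of the
11-byte header into its four fields follows the DWARF 2 compilation unit
header for a 32-bit target; the dictionary and function names are
illustrative, and the 1-byte sizes assume a single-byte abbrev code and a
1-byte form for AT_language.

```python
# DWARF 2 CU header, 32-bit target (sizes in bytes).
CU_HEADER = {
    "unit_length": 4,
    "version": 2,
    "debug_abbrev_offset": 4,
    "address_size": 1,
}

# Minimal CU DIE as listed in the text.
FULL_CU_DIE = {
    "CU DIE tag (abbrev code)": 1,
    "AT_language": 1,
    "AT_producer (pointer into .debug_str)": 4,
    "AT_stmt_list": 4,
}

# CU DIE under the suggested AT_extension approach.
EXTENSION_CU_DIE = {
    "CU DIE tag (abbrev code)": 1,
    "AT_extension (reference to the main CU)": 4,
}


def cu_overhead(die_fields):
    """Total bytes for a CU: header plus the fields of its top-level DIE."""
    return sum(CU_HEADER.values()) + sum(die_fields.values())


full_total = cu_overhead(FULL_CU_DIE)
extension_total = cu_overhead(EXTENSION_CU_DIE)
```

This gives 21 bytes for the full secondary CU and 16 bytes with the extension,
matching the figures in the text.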
_str, _loc, and _frame can all be broken up easily; the chunks we're
interested in don't need headers or depend on other information.
_macinfo is tricky because it is linear. Breaking it into chunks
would require some sort of extension -- perhaps a symbolic reference to
the macro information for a particular header, to be used instead of
MACINFO_{start,end}_file.
This proposal has been revised and is replaced by 010219.1.