991102.1 A A D. Anderson 64-bit Modify field definitions for 64-bit arch

**Existing Practice

Cygnus has followed the existing spec and made
all offsets and lengths 32 bits in DWARF
for both 64-bit and 32-bit pointer targets,
regardless of whether the object is elf32 or elf64
(or some other object format).

SGI has made all offsets and lengths that the DWARF spec
specifies as 32-bit follow the ABI instead:
a) for 32-bit pointer objects, 32 bits, solely in elf32 objects.
b) for 64-bit pointer objects, 64 bits, solely in elf64 objects.

Others?

**The Issues

64bit offsets and/or lengths are likely to be needed
in a small number of cases.

Making the determination based on information such
as elf header information (as SGI does) makes
DWARF2 dependent on information outside the spec.
It is also inconsistent with allowing
compilation-unit debug data with 32-bit offsets
and with 64-bit offsets to be transparently linked
into one object: the linker would have to expand
the 32-bit offsets or truncate the 64-bit ones
for the result to be interpretable.

**Proposal

This proposal allows free mixing of 32 and 64bit dwarf.
Compiler providers can emit full 64bit uniformly
or can use 32bit. The offset/length size is
per compilation unit and
independent of the pointer size of the object/executable.

****Key Observation

Every section with compilation unit data with
fixed-length fields that limit size has, per
compilation unit at the start of
the data for each CU, either

        length version
or
        length id version

****Proposal details

In each compilation unit, the initial 'length' field
is the distinguished value 0xffffffff if and only if
the compilation unit uses 64 bit offsets and lengths.

Following the 32bit 0xffffffff value is the
complete compilation unit data, with lengths and
offsets being 64 bits.

Note that DW_FORM_REFn is self-identifying for
any n, so
the distinguished value has no impact whatever on such fields.

The version number does not change.
In effect this leaves version to identify some
other substantive change. The distinguished value
determines offset/length sizes.
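
As an illustration only, a reader of the initial 'length' field
might look like the following sketch. The function names, the
tuple layout and the little_endian flag are assumptions made for
this example and are not part of the proposal. The second function
shows how units with 32-bit and 64-bit offsets could be walked in
one section, since each unit announces its own size.

```python
import struct

DWARF64_ESCAPE = 0xFFFFFFFF  # distinguished value from the proposal


def read_initial_length(data, pos=0, little_endian=True):
    """Return (length, offset_size, body_pos) for the unit at pos.

    offset_size is 4 for a 32-bit unit and 8 for a 64-bit unit.
    body_pos is the position of the first byte counted by length.
    """
    prefix = "<" if little_endian else ">"
    (first,) = struct.unpack_from(prefix + "I", data, pos)
    if first != DWARF64_ESCAPE:
        return first, 4, pos + 4
    # Escape seen: the true length is the following 64 bits.
    (length,) = struct.unpack_from(prefix + "Q", data, pos + 4)
    return length, 8, pos + 12


def walk_units(section, little_endian=True):
    """Yield (unit_pos, offset_size, length) for each unit in a section."""
    pos = 0
    while pos < len(section):
        length, offset_size, body = read_initial_length(
            section, pos, little_endian)
        yield pos, offset_size, length
        pos = body + length  # the length counts the bytes after itself


if __name__ == "__main__":
    # One 32-bit unit with 3 body bytes, then one 64-bit unit with 2.
    mixed = (struct.pack("<I", 3) + b"abc"
             + struct.pack("<IQ", DWARF64_ESCAPE, 2) + b"xy")
    for unit in walk_units(mixed):
        print(unit)
```

A consumer written this way needs no information from the
object file header to find the offset size.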

****Distinguished Value

Rather than just one distinguished value,
the proposal is that all values > 0xffffff00
be considered 'distinguished values'
and that 0xffffffff be reserved to indicate
64-bit offsets and lengths.

The other values in that range are reserved to
the Dwarf specification, and are not legal
lengths.

(this is no practical limitation)
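
The ranges above can be sketched as a small classifier. This is
an illustration, not spec text: the function name and the returned
strings are invented for the example, and the boundary follows
the "> 0xffffff00" wording of this proposal.

```python
RESERVED_FLOOR = 0xFFFFFF00  # values above this are not legal lengths
DWARF64_ESCAPE = 0xFFFFFFFF  # the one distinguished value given a meaning


def classify_initial_length(value):
    """Classify a 32-bit initial length word per the proposed ranges."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("initial length word must fit in 32 bits")
    if value == DWARF64_ESCAPE:
        return "64-bit unit"
    if value > RESERVED_FLOOR:
        return "reserved"
    return "32-bit length"
```

A 32-bit unit can therefore still be 0xffffff00 bytes long,
which is why the reservation costs nothing in practice.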

****Per CU consequences spelled out

Ref
Doc
Page section
---- --------------
p66 .debug_info
        Per compilation unit, if the length field is the
        distinguished value 0xffffffff then
        the following 64 bits are the true length field
        followed by the version number and the 64-bit
        offset into the .debug_abbrev section.
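
        A sketch of reading the .debug_info header in both
        forms follows. It assumes the DWARF2 header layout of
        length, 16-bit version, abbrev offset and a 1-byte
        address size. CUHeader, parse_cu_header and the
        little_endian flag are names made up for this example.

```python
import struct
from collections import namedtuple

CUHeader = namedtuple(
    "CUHeader",
    "length version abbrev_offset address_size offset_size header_size")


def parse_cu_header(data, pos=0, little_endian=True):
    """Parse a .debug_info unit header in 32-bit or 64-bit form."""
    e = "<" if little_endian else ">"
    (first,) = struct.unpack_from(e + "I", data, pos)
    if first == 0xFFFFFFFF:
        # 64-bit form: escape, 64-bit length, version,
        # 64-bit .debug_abbrev offset, address size.
        length, version, abbrev, addr = struct.unpack_from(
            e + "QHQB", data, pos + 4)
        return CUHeader(length, version, abbrev, addr, 8, 4 + 8 + 2 + 8 + 1)
    # 32-bit form: the first word is the length itself.
    version, abbrev, addr = struct.unpack_from(e + "HIB", data, pos + 4)
    return CUHeader(first, version, abbrev, addr, 4, 4 + 2 + 4 + 1)
```

        The header grows from 11 to 23 bytes in the 64-bit
        form; the version field keeps its value in both.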

p61 .debug_frame
        Per CIE, if the length field is the
        distinguished value 0xffffffff then
        the following value is the 64-bit true length,
        followed by the 64-bit reserved value
        (which is what indicates this is a CIE, not an FDE).

        [.debug_frame is very processor-specific.
          SGI chose the CIE reserved value to be
          32 bits (or 64 bits for 64-bit objects), all bits on.

          For .eh_frame
          (similar but not quite identical), Cygnus chose 0.

          This distinguished value is not part
          of this proposal!
        ]

        Per FDE, if the length field is the
        distinguished value 0xffffffff then
        the following value is the 64-bit true length,
        followed by the 64-bit offset to the CIE.

        This does have the effect that the 'length'
        field is no longer guaranteed to be aligned,
        though the initial 32 bits are guaranteed
        to be aligned to the address-size.

p77 .debug_pubnames
        Per compilation unit if the length field is the
        distinguished value 0xffffffff then
        the following value is the 64bit true length
        followed by the 16 bit version and the 64 bit
        offset to the .debug_info section and
        the 64bit length of pairs.
        Each pair has a 64bit offset, and the terminating
        value (the offset with no string) is a 64bit zero value.
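
        The pair list can be walked as in this sketch. It
        assumes the header has already been read, so that pos
        points at the first pair and offset_size (4 or 8) is
        known; the function name and arguments are invented
        for the example.

```python
import struct


def iter_pubnames_pairs(data, pos, offset_size, little_endian=True):
    """Yield (offset, name) pairs until the zero terminating offset."""
    fmt = ("<" if little_endian else ">") + ("Q" if offset_size == 8 else "I")
    while True:
        (die_offset,) = struct.unpack_from(fmt, data, pos)
        pos += offset_size
        if die_offset == 0:
            return  # terminator: an offset with no string after it
        end = data.index(b"\0", pos)  # names are NUL-terminated
        yield die_offset, data[pos:end].decode("ascii")
        pos = end + 1
```

        Only the width of the offset and of the terminator
        changes; the strings are unaffected.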

p77 .debug_aranges
        Per compilation unit if the length field is the
        distinguished value 0xffffffff then
        the following value is the 64bit true length
        followed by the 16 bit version and the 64 bit
        offset to the .debug_info section.

p52
p77 .debug_line
        Per compilation unit if the length field is the
        distinguished value 0xffffffff then
        the following value is the 64bit true length
        followed by the 16 bit version and the 64 bit
        prolog length.

        In other words, given the distinguished value,
        the 'uword' and 'sword' are 64 bits.

**Other approaches considered (discarded)

****Using version 3.

Given a distinguished initial length, the following
16 bits is the version number, and that version
number is 3. Following that is
a fully-64-bit length and (where applicable) offset.

There is more than one way of arranging
this, as one could imagine simply redoing the
info with 64 bit data, as in
64bit length, version, offset-to-debug_info
(so the version number gets repeated!)

Or changing it so that after the 3 the second version
is suppressed.

Either the version gets duplicated or it does
not, but if version is not duplicated then
the following info is really like the real proposal,
but with a version number change (with no real information
content).

It is not clear to me either variant is an improvement
over the proposal above.

****Each field individually extendable.

At each field (length, offset, etc) for which there
is a 32bit value, if that individual value is to be
extended the 32 bits has the distinguished value
and the following 64bits is the true value.

This is very simple conceptually.

This forces every 64-bit value to really occupy 96 bits.

I do not consider spending the space on all those distinguished
values to be a good idea.
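
The space argument can be made concrete with a little
arithmetic. The sketch below compares the two schemes for a
unit that contains n 64-bit fields; the function names are
made up for the example.

```python
ESCAPE_BYTES = 4    # one 32-bit distinguished value
VALUE64_BYTES = 8   # one 64-bit length or offset


def per_field_cost(n_fields):
    """Bytes used when every extended field carries its own escape."""
    return n_fields * (ESCAPE_BYTES + VALUE64_BYTES)


def per_unit_cost(n_fields):
    """Bytes used when one escape at the start covers the whole unit."""
    return ESCAPE_BYTES + n_fields * VALUE64_BYTES


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, per_field_cost(n), per_unit_cost(n))
```

With a single field the two schemes cost the same 12 bytes
(96 bits); the per-field scheme then pays 4 extra bytes for
every further 64-bit field.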

**Endianness
There is no endianness issue in this proposal.
Like the base spec, the endianness of lengths
and offsets is determined by the cpu or other
information outside of DWARF2 itself (however
the Version number 2 has detectable endianness,
so perhaps endianness is best thought of
as *being* detectable in the DWARF2 data).
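
A sketch of that detection follows. It applies only to units
laid out as 'length version' and assumes the version is 2.
Note that 0xffffffff reads the same in either byte order, so
the distinguished value can be recognized before the byte
order is known. The function name is invented for the example.

```python
def guess_endianness(unit):
    """Guess the byte order of a unit from its 16-bit version field.

    Returns "little", "big", or None if the version is not 2.
    """
    # The version follows a 4-byte length, or a 4-byte escape
    # plus an 8-byte true length.
    version_pos = 12 if bytes(unit[:4]) == b"\xff\xff\xff\xff" else 4
    version = bytes(unit[version_pos:version_pos + 2])
    if version == b"\x02\x00":
        return "little"
    if version == b"\x00\x02":
        return "big"
    return None
```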


It was decided that this 64-bit format was reasonably compatible with 32-bit Dwarf2.

A compiler would need to be told to generate 64-bit Dwarf data. A smart linker might be able to combine 32-bit Dwarf with 64-bit Dwarf, but lacking this, all Dwarf files would need to be the same type. An existing tool which did not handle 64-bit Dwarf would likely fail upon reading a 64-bit Dwarf file, rather than provide spurious results.

This will require a number of editorial changes to the document.  See 000403.2 and 000410.4.