000531.2 A R. Brender Representation Add DW_AT_entry_pc attribute

SUMMARY
-------

Add attribute DW_AT_entry_pc, to allow specification of an "entry address"
in those cases where either

  - it is different than DW_AT_low_pc, or
  - no DW_AT_low_pc attribute is present.


PROPOSAL
--------

  - In V2/section 2.10 (2.11 in Draft 3) include mention of DW_AT_entry_pc
    like DW_AT_low_pc and DW_AT_high_pc, a la:

        "If the entry containing the DW_AT_segment attribute has a
        DW_AT_low_pc, DW_AT_high_pc or DW_AT_entry_pc attribute, or a
        -----------------
        location description that evaluates to an address, then those
        values represent the offset portion of the address within the
        segment specified by DW_AT_segment."

  - In V2/section 3.2/page 25, add the following sentence after the paragraph
    that begins "If the module contains initialization code...":

        "It [the module entry] may also have a DW_AT_entry_pc attribute whose
        value is the address of the first executable instruction of that
        initialization code. If a DW_AT_entry_pc attribute is not present,
        then the first executable instruction is assumed to occur at
        the DW_AT_low_pc address."

  - In V2/section 3.3.2/page 26, add the following new paragraph as the new
    second paragraph (following the description of DW_AT_low_pc and _high_pc):

        "A subroutine entry may also have a DW_AT_entry_pc whose value is
        the address of the first executable instruction of the subroutine.
        If a DW_AT_entry_pc attribute is not present, then the first executable
        instruction is assumed to occur at the DW_AT_low_pc address."

    In this same section, add the following new italics paragraph after the
    paragraph that reads "An entry point has a DW_AT_low_pc attribute whose
    value is the relocated address of the first machine instruction generated
    for the entry point":

        <i>Note that there is no need for an entry to have a DW_AT_entry_pc
        attribute because the DW_AT_low_pc attribute can (and should) be
        specified as the first <u>executable</u> instruction for the entry
        point.</i>

  - In V2/section 3.3.8.1/page 29, add DW_AT_entry_pc to the list of attributes
    given in the second paragraph (attributes that do not occur in an abstract
    instance).

  - In V2/section 3.3.8.2/page 29, add the following sentence to the end of the
    second paragraph:

        "An inlined subroutine entry may also contain a DW_AT_entry_pc
        attribute, representing the address of the first executable
        instruction of the inline expansion; if no DW_AT_entry_pc attribute
        is present, then DW_AT_low_pc represents the first executable
        instruction."

    Further in this same section, add DW_AT_entry_pc following DW_AT_high_pc
    in the list of attributes that can occur in a concrete inlined instance.

  - In V2/Figure 18/section 7.5.4/page 71, add

        DW_AT_entry_pc <next code> address

  - In Appendix 1, add DW_AT_entry_pc as an applicable attribute to

        DW_TAG_module
        DW_TAG_subprogram
        DW_TAG_inlined_subprogram


DISCUSSION
----------

It appears that DWARF V2 never actually specifies that the first *executable*
address of a subroutine (or module initialization) is the same as the low pc
of that routine. This proposal makes that implicit assumption explicit.

However, it is not necessarily the case that the first executable instruction
is the same as the instruction with the lowest pc. One reason for this
derives from optimized code: If the scope of an entity cannot be represented
as a single contiguous range, then low pc and high pc are not supposed to be
used--but there is still a need to specify where execution begins.

Another reason can occur as follows: Once upon a time we (Digital/Compaq)
had some neat instruction block ordering heuristics that attempted to minimize
the need for branch instructions by arranging for one block to "fall into"
the next where possible. This led to one of the prettiest pieces of compiled
code you ever wanted to see--it looked basically like this:

    1$:         update link list pointer in Rn
                ...
    SEARCH::    initial/head pointer passed in Rn
                test for empty list
                return null if empty
                test for successful match
                branch to 1$ if failure
                return pointer to matching entry
    2$:

For this routine, the entry point is in the middle of the loop! (Of course,
the routine was simple enough to not require any stack context). Since the
code is contiguous, low pc corresponds to 1$ and high pc corresponds to 2$.
But 1$ is not the entry point!

The sad part is we actually had to "de-optimize" the code (by adding a branch
instruction just before 1$ that could serve as both the low pc and the entry
point)--because neither the system symbol tables nor the exception handling
tables allowed us to separate the two roles (low address vs first executable
address).

The proposed DW_AT_entry_pc attribute deals with both of these problems.
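
As a rough illustration only (not part of the proposed text), a consumer
might resolve the entry address along the following lines. The Die class,
the entry_address helper and all of the addresses are hypothetical, chosen
just to mirror the two cases above plus the ordinary case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Die:
    """Hypothetical, simplified stand-in for a debugging information entry."""
    tag: str
    low_pc: Optional[int] = None    # DW_AT_low_pc, if present
    high_pc: Optional[int] = None   # DW_AT_high_pc, if present
    entry_pc: Optional[int] = None  # DW_AT_entry_pc, if present


def entry_address(die: Die) -> Optional[int]:
    """Return the address where execution begins, or None if unknown."""
    if die.entry_pc is not None:
        return die.entry_pc
    # No DW_AT_entry_pc: the first executable instruction is assumed
    # to occur at the DW_AT_low_pc address (which may itself be absent).
    return die.low_pc


# Contiguous code whose entry point sits in the middle (the SEARCH case):
# low pc plays the role of 1$, entry pc the role of SEARCH. Made-up addresses.
search = Die("DW_TAG_subprogram", low_pc=0x1000, high_pc=0x1020,
             entry_pc=0x1008)

# Non-contiguous code: no low pc / high pc, but execution still begins somewhere.
scattered = Die("DW_TAG_subprogram", entry_pc=0x2040)

# Ordinary routine: no entry pc, so the entry address falls back to low pc.
plain = Die("DW_TAG_subprogram", low_pc=0x3000, high_pc=0x3010)

for die in (search, scattered, plain):
    print(die.tag, hex(entry_address(die)))
```

The fallback order matches the wording proposed for sections 3.2, 3.3.2 and
3.3.8.2: use DW_AT_entry_pc when present, otherwise DW_AT_low_pc.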

Note that DW_AT_entry_pc does not appear to be applicable to or needed for:

  - a compilation unit entry (described in section 3.1/page 23).
  - a lexical block (V2/section 3.4/p31)
  - a label (V2/section 3.5/p31-32)
  - a with statement entry (V2/section 3.6/p32)
  - a catch block entry (V2/section 3.7/p32)

[I suppose that one *could* claim that DW_AT_entry_pc might be usable in
a lexical block, with statement or catch block (all three are lexical blocks,
actually) -- but I think this is rare and can be left for a vendor "extension"
if appropriate.]

For this same reason, it does not appear to be important to add a summary
description of DW_AT_entry_pc at the end of section 2 (as was done for
DW_AT_low_pc and DW_AT_high_pc). [Although I could be persuaded otherwise
if someone feels strongly...]


Approved.