Issue 991108.11

991108.11

D. Anderson

Fortran

Fortran90 arrays

Problem: complex allocatable f90 arrays

No proposal included here.
This is just a problem description with one
solution sketched briefly.

Consider the following f90 code.

    type array_ptr
        real :: myvar
        real, dimension (:), pointer :: ap
    end type array_ptr

    type (array_ptr), allocatable, dimension (:) :: arrays

    allocate (arrays(20))
    do i = 1,20
        allocate (arrays(i)%ap(i))
    end do

arrays is a one-dimensional allocatable array whose size is
not known at compile time (it has
a Dope Vector). At run time, the first
allocate statement creates 20 array_ptr dope vectors
and marks the base dope vector of arrays as allocated.
The myvar variable is just there to add complexity to
the example :-)

In the loop, arrays(1)%ap(1)
    is allocated as a single element array of reals.
In the loop, arrays(2)%ap(2)
    is allocated as an array of two reals.
...
In the loop, arrays(20)%ap(20)
    is allocated as an array of twenty reals.

The problem is that there is no known way
in dwarf to describe the array bounds of
arrays(2)%ap
for example.

SGI solution:
DW_AT_MIPS_ptr_dopetype
DW_AT_MIPS_allocatable_dopetype
DW_AT_MIPS_assumed_shape_dopetype

DW_AT_MIPS_assumed_shape_dopetype, DW_AT_MIPS_allocatable_dopetype,
and DW_AT_MIPS_ptr_dopetype have an attribute value
which is a reference to a Fortran 90 Dope Vector.
These attributes are introduced in MIPSpro7.3.
They only apply to f90 arrays (where they are
needed to describe arrays never properly described
before in debug information).
C, C++, f77, and most f90 arrays continue to be described
in standard dwarf.

The distinction between these three attributes is the f90 syntax
distinction: keywords 'pointer' and 'allocatable' with the absence
of these keywords on an assumed shape array being the third case.

A "Dope Vector" is a struct (C struct) which describes
a dynamically-allocatable array.
In objects with full debugging the C struct will be
in the dwarf information (of the f90 object, represented like C).
A debugger will use the link to find the main struct DopeVector
and will use that information to decode the dope vector.
At the outermost allocatable, assumed-shape, or pointer array,
the DW_AT_location points at the dope vector (so debugger
calculations use that as a base).

This is an overly simplified version of a dope vector,
presented as an initial hint.

struct simplified {
    void *base; // pointer to the data this describes
    long el_len;
    // unsigned bit-fields, so that num_dims can hold values 0..7
    unsigned int assoc:1;
    unsigned int ptr_alloc:1;
    unsigned int num_dims:3;
    struct dims_s {
        long lb;
        long ext;
        long str_m;
    } dims[7];
};
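
As a hint at how a debugger might decode such a struct, here is a
minimal sketch in C. It assumes (this is not stated above) that lb is
the lower bound of a dimension, that ext is its extent (element count),
and that assoc is the 'allocated/associated' flag. The helper names
dv_lower_bound, dv_upper_bound and dv_is_usable are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified dope vector, mirroring the struct shown above. */
struct simplified {
    void *base;                 /* pointer to the data this describes */
    long el_len;
    unsigned int assoc:1;
    unsigned int ptr_alloc:1;
    unsigned int num_dims:3;
    struct dims_s {
        long lb;                /* ASSUMPTION: lower bound */
        long ext;               /* ASSUMPTION: extent (element count) */
        long str_m;
    } dims[7];
};

/* Lower bound of dimension dim (0-based dimension number). */
static long dv_lower_bound(const struct simplified *dv, unsigned dim)
{
    assert(dim < dv->num_dims);
    return dv->dims[dim].lb;
}

/* Upper bound of dimension dim, derived as lb + ext - 1. */
static long dv_upper_bound(const struct simplified *dv, unsigned dim)
{
    assert(dim < dv->num_dims);
    return dv->dims[dim].lb + dv->dims[dim].ext - 1;
}

/* ASSUMPTION: bounds are meaningful only when assoc is set and
   base is non-null. */
static int dv_is_usable(const struct simplified *dv)
{
    return dv->assoc && dv->base != NULL;
}
```

Under these assumptions, a dope vector with lb = 1 and ext = 3
describes an array with bounds 1:3 in f90 syntax.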

Fundamentally, we build two distinct
representations of the arrays and pointers.
One, in dwarf, represents the statically-representable
information (the types and
variable/type-names, without type size information).
The other, using dope vectors in memory, represents
the run-time data of sizes.
A debugger must process the two representations
in parallel (and merge them) to evaluate user expressions.

In dwarf, there is no way to find the array bounds of arrays(3)%ap,
for example (which are 1:3 in f90 syntax),
since a location expression in the lower bound attribute
of the ap array cannot involve the 3: the 3 is known only at debug
time and does not appear in the running binary, so the
location expression has no way to get to it.
And of course the 3 must actually index across the array of
dope vectors in 'arrays' in our implementation, but that is less of
a problem than the problem with the '3' (given well
defined dope vectors).
One approach here would be like that for structs: presume
that one or more debugger-known values are 'pushed'
onto the location description value stack before evaluating
the expression. This has not been implemented.
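
To illustrate the 'pushed value' idea only, here is a toy stack
evaluator in C. It is a sketch, not a proposal: the opcodes (OP_CONST
and so on) are hypothetical and are not dwarf opcodes, and the element
size and offset used in the example are made-up numbers. The point is
just that the debugger seeds the stack with the user's index before
the expression runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration; NOT dwarf opcode values. */
enum op { OP_CONST, OP_PLUS, OP_MINUS, OP_MUL, OP_END };

struct insn {
    enum op op;
    long arg;       /* used by OP_CONST only */
};

/* Evaluate expr on a stack the debugger has already seeded with
   seed_count values (for example the index 3 from arrays(3)%ap).
   Returns the value left on top of the stack. */
static long eval_with_seed(const struct insn *expr,
                           const long *seed, size_t seed_count)
{
    long stack[32];
    size_t sp = 0;

    assert(seed_count <= 32);
    for (size_t i = 0; i < seed_count; i++)
        stack[sp++] = seed[i];

    for (; expr->op != OP_END; expr++) {
        switch (expr->op) {
        case OP_CONST:
            assert(sp < 32);
            stack[sp++] = expr->arg;
            break;
        case OP_PLUS:
            assert(sp >= 2);
            stack[sp - 2] = stack[sp - 2] + stack[sp - 1];
            sp--;
            break;
        case OP_MINUS:
            assert(sp >= 2);
            stack[sp - 2] = stack[sp - 2] - stack[sp - 1];
            sp--;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] = stack[sp - 2] * stack[sp - 1];
            sp--;
            break;
        case OP_END:
            break;
        }
    }
    assert(sp >= 1);
    return stack[sp - 1];
}
```

For example, with the index 3 pushed first, the expression
CONST 1, MINUS, CONST 200, MUL, CONST 8, PLUS computes
(3 - 1) * 200 + 8, a byte offset to a hypothetical third element's
dope vector, assuming a lower bound of 1, 200-byte elements and a
dope vector 8 bytes into each element.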

In addition, dwarf has no way to find the 'allocated' flag in the
dope vector (which the debugger needs in order to know whether the
allocate has been done for a particular arrays(j)%ap).

In current implementations the calculations for the bounds
of arrays(3)%ap are left entirely to the debugger, starting from
the dope vector of arrays.
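
A rough sketch of that debugger-side calculation, in C, using the
simplified dope vector from earlier. Several things here are
assumptions rather than facts about any implementation: that each
array_ptr element holds myvar followed by an embedded dope vector for
ap (struct array_ptr_elem is hypothetical), that str_m counts elements
of el_len bytes, and that assoc is the 'allocated' flag. A real
debugger would read target memory rather than follow pointers
directly.

```c
#include <assert.h>
#include <stddef.h>

struct simplified {
    void *base;
    long el_len;
    unsigned int assoc:1;
    unsigned int ptr_alloc:1;
    unsigned int num_dims:3;
    struct dims_s { long lb; long ext; long str_m; } dims[7];
};

/* HYPOTHETICAL layout of one element of 'arrays'. */
struct array_ptr_elem {
    float myvar;
    struct simplified ap;   /* dope vector for the ap component */
};

/* Address of element index (f90 index) of a rank-1 array described
   by dv, or NULL if it is unallocated or index is out of bounds.
   ASSUMPTION: str_m is a stride in units of el_len bytes. */
static void *dv_element_addr(const struct simplified *dv, long index)
{
    if (!dv->assoc || dv->base == NULL || dv->num_dims != 1)
        return NULL;
    long lb = dv->dims[0].lb;
    long ext = dv->dims[0].ext;
    if (index < lb || index >= lb + ext)
        return NULL;
    return (char *)dv->base
           + (index - lb) * dv->dims[0].str_m * dv->el_len;
}

/* Bounds of arrays(index)%ap, starting from the dope vector of
   arrays. Returns 1 and fills lo/hi on success, 0 otherwise. */
static int inner_bounds(const struct simplified *outer, long index,
                        long *lo, long *hi)
{
    const struct array_ptr_elem *elem = dv_element_addr(outer, index);
    if (elem == NULL || !elem->ap.assoc)
        return 0;
    *lo = elem->ap.dims[0].lb;
    *hi = elem->ap.dims[0].lb + elem->ap.dims[0].ext - 1;
    return 1;
}
```

With dope vectors set up as in the f90 example, inner_bounds for
index 3 would report 1:3, and it reports failure when the index is
out of range or the inner array is not yet allocated.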