
1 Introduction

1.1 Parallel programming models

Three parallel programming models are now prevalent in large, scalable systems (see Fig. 1.1): the message-passing model (MPM), the shared-memory model (SMM) and the data-parallel model (DPM).

Message-passing model. In the message-passing model each process has its own local address space. Processing of common data and synchronization are performed by message passing. The generalization and standardization of various message-passing libraries resulted in the development of the MPI standard [1].

Shared-memory model. In the shared-memory model processes share a common address space. Since there are no restrictions on the use of common data, a programmer must explicitly identify the common data and regulate access to it using synchronization tools. In high-level languages, logically independent threads are defined at the level of functional tasks or at the level of loop iterations. The generalization and standardization of shared-memory models resulted in the development of the OpenMP standard [2].

Data-parallel model. In the data-parallel model the notion of a process is absent and, as a result, so are explicit message passing and explicit synchronization. In this model data are distributed over the nodes (processors) of a computing system. A sequential program is translated by the compiler into either a message-passing program or a shared-memory program (Fig. 1.1). The computations are distributed according to the owner-computes rule: each processor performs only the computations on its own data, that is, on the data allocated on that processor.

In comparison with the two previous models, DPM has obvious advantages. A programmer is freed from the tedious work of splitting a global array into local arrays, explicitly managing the sending and receiving of messages, and explicitly synchronizing parallel processes. The application area of the model, however, is still a subject of research. The research results show that the performance of many scientific computing algorithms in the DPM model is comparable to the performance of their MPM and SMM implementations.

The development of HPF1 [3] was the first attempt to standardize DPM. Whereas MPM and SMM standardization drew on extensive experience with implementations and practical applications, the HPF1 standard was developed on the basis of theoretical research and only 2-3 experimental implementations. Moreover, the standard relied on automatic parallelization of computations and automatic synchronization of shared data access. The first HPF1 implementations showed that the standard was inefficient for modern computational methods (in particular, for irregular computations). The next version of the standard, HPF2 [2], took a step towards "manual" control of parallel program performance; in particular, the distribution of computations and the specification of common reduction variables were defined.

Fig. 1.1. Three models of parallel programming

1.2 DVM-approach to parallel program development

DVM-system provides a unified toolkit for developing parallel programs for scientific-technical computations in C and Fortran 77.

DVM parallel model. The DVM parallel model is based on the data-parallel model. The name DVM reflects two expansions - Distributed Virtual Memory and Distributed Virtual Machine - showing that the DVM model is adapted both to shared-memory and to distributed-memory systems. The high-level DVM model not only decreases the cost of parallel program development, but also provides a unified, formalized base for the Run-Time Support system, debugging, and performance analysis and prediction.

Languages and compilers. As distinct from HPF, DVM-system does not aim at fully automatic parallelization of computations and synchronization of common data access. Using high-level specifications, a programmer has full control over the performance of the parallel program. On the other hand, during the design and development of Fortran DVM, compatibility with subsets of the HPF1 and HPF2 standards was preserved.

The unified parallel model is built into the C and Fortran 77 languages using constructions that are "transparent" to standard compilers, which makes it possible to keep a single version of the program for both sequential and parallel execution. The C-DVM and Fortran DVM compilers translate a DVM program into a C or Fortran 77 program, respectively, including calls to the Run-Time Support system for parallel execution. Thus the only requirement for a parallel system is the availability of C and Fortran 77 compilers.

Execution and debugging technique. The unified parallel model allows a unified Run-Time Support system for both languages and, as a result, a unified system of debugging, performance analysis and prediction. The following modes of DVM-program execution and debugging are available:

The following debugging modes are provided in the pseudo-parallel and parallel execution modes:

2 Language overview

2.1 Programming model and model of parallelism

The Fortran DVM language is an extension of the Fortran 77 language [2]. The extension is implemented via special comments, named directives. FDVM directives may be conditionally divided into three subsets:

The FDVM parallel model is based on a specific form of data parallelism called SPMD (Single Program, Multiple Data). In this model the same program is executed on all the processors concerned, but each processor performs its own subset of statements in accordance with the data distribution.

First, in the FDVM model a user defines a multidimensional arrangement of virtual processors, onto sections of which data and computations will be mapped. A section can vary from the whole processor arrangement down to a single processor.

Then the arrays to be distributed over the processors (distributed data) are determined. These arrays are specified by data mapping directives (section 4). The remaining variables (distributed by default) are mapped with one copy per processor (replicated data). The value of a replicated variable must be the same on all the processors concerned. The only exception is variables in parallel constructions (see sections 5.1.3, 5.1.4 and 7.5).
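
For illustration, a minimal sketch using the directives defined in sections 3 and 4 (the names P, A and S are placeholders): the array A is distributed data, while the scalar S is replicated with one identical copy on every processor.

CDVM$	PROCESSORS P( 4 )
	REAL A(100)
	REAL S
CDVM$	DISTRIBUTE A (BLOCK) ONTO P
C	A is distributed by blocks of 25 elements;
C	S exists as one copy per processor and must hold
C	the same value on all of them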

The FDVM model defines two levels of parallelism:

Data parallelism is implemented by distributing the iterations of tightly nested loops over the processors (section 5). Each iteration of such a loop is executed entirely on one processor. The statements located outside parallel loops are executed according to the rule of own computations (section 5.2).

Task parallelism is implemented by distributing data and independent computations over sections of the processor arrangement (section 7).

When calculating the value of its own variable, a processor may need the values not only of its own variables but also of other (remote) variables. All remote variables must be specified in remote data access directives (section 6).
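
As a forward-looking sketch only (the PARALLEL directive and the remote access specifications are defined in sections 5 and 6; the names and the exact clause form here are illustrative): each processor computes the elements of A that it owns, and the reference B(I-1) is specified as remote because it may reside on another processor.

C	a hedged sketch; see sections 5 and 6 for the exact syntax
CDVM$	PARALLEL ( I ) ON A( I ), REMOTE_ACCESS ( B( I - 1 ) )
	DO 10 I = 2, N
	   A(I) = B(I-1)
10	CONTINUE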

2.2 Syntax of FDVM directives

The syntax of FDVM directives is described using a Backus-Naur form and the following notations:

is	is by definition
or	an alternative construct
[ ]	encloses an optional construct
[ ]…	encloses an optionally repeated construct, which may occur zero or more times
x-list	x [ , x ]…

Syntax of the directive.

directive-line is      CDVM$ dvm-directive
  or    *DVM$ dvm-directive
   
dvm-directive is      specification-directive
  or     executable-directive
   
specification-directive is       processors-directive
  or     align-directive
  or     distribute-directive
  or     template-directive
  or     pointer-directive
  or     shadow-directive
  or     dynamic-directive
  or     inherit-directive
  or     remote-group-directive
  or     reduction-group-directive
  or     task-directive
  or      heap-directive
  or      asyncid-directive
   
executable-directive is      realign-directive
  or     redistribute-directive
  or     parallel-directive
  or     remote-access-directive
  or     shadow-group-directive
  or     shadow-start-directive
  or     shadow-wait-directive
  or     reduction-start-directive
  or     reduction-wait-directive
  or     new-value-directive
  or     prefetch-directive
  or     reset-directive
  or     parallel-task-loop-directive
  or     map-directive
  or     task-region-directive
  or     end-task-region-directive
  or     on-directive
  or     end-on-directive
  or     f90-directive
  or     asynchronous-directive
  or     end-asynchronous-directive
  or     asyncwait-directive

Constraints:

No statements may be interspersed within a continued directive, and a directive line must not appear within a continued statement. An example of directive continuation follows. Note that column 6 must be blank, except when signifying continuation.

CDVM$  ALIGN SPACE1(I,J,K)
CDVM$* WITH SPACE(J,K,I)

3 Virtual processor arrangements. PROCESSORS directive

The PROCESSORS directive declares one or more rectangular virtual processor arrangements.

Syntax.

processors-directive is      PROCESSORS processors-decl-list
   
processors-decl is      processors-name ( explicit-shape-spec-list )
explicit-shape-spec is      [ lower-bound : ] upper-bound
   
lower-bound is      int-expr
   
upper-bound is      int-expr
   

The intrinsic function NUMBER_OF_PROCESSORS( ) can be used to determine the number of real processors, provided to a program.

It is possible to use several virtual processor arrangements of different shapes, provided that the number of processors in every arrangement is equal to the value of the function NUMBER_OF_PROCESSORS( ). If two virtual processor arrangements have the same shape, the corresponding elements of the arrangements refer to the same virtual processor.

Example 3.1. Declaration of virtual processor arrangements.

CDVM$	PROCESSORS  P( N )
CDVM$	PROCESSORS  Q( NUMBER_OF_PROCESSORS( ) ), 
CDVM$*	            R(2, NUMBER_OF_PROCESSORS( )/2)

The value of N must be equal to the value of the function NUMBER_OF_PROCESSORS( ).

Processor arrangements are local objects of a procedure. Data arrays with the COMMON and SAVE attributes can be mapped onto local processor arrangements only if the local processor arrangement has the same shape whenever the procedure is called.
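
A minimal sketch of this rule (the names WORK, LP and X are placeholders): the shape of LP depends only on NUMBER_OF_PROCESSORS( ), so it is the same whenever WORK is called, and the COMMON array X may therefore be mapped onto it.

	SUBROUTINE WORK
	COMMON /DAT/ X(100)
CDVM$	PROCESSORS LP( NUMBER_OF_PROCESSORS( ) )
CDVM$	DISTRIBUTE X (BLOCK) ONTO LP
C	LP has the same shape on every call, so the
C	distribution of X is well defined across calls
	END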

4 Data mapping

FDVM supports distribution by blocks (equal and non-equal), inherited distribution, dynamic array distribution and distribution via alignment.

4.1 DISTRIBUTE and REDISTRIBUTE directives

Syntax.

distribute-directive is      dist-action distributee dist-directive-stuff
  or     dist-action [ dist-directive-stuff ] :: distributee-list
   
dist-action is     DISTRIBUTE
  or    REDISTRIBUTE
   
dist-directive-stuff is      dist-format-list [ dist-onto-clause ]
   
distributee is      array-name
   
dist-format is      BLOCK
  or     GEN_BLOCK ( block-size-array )
  or     WGT_BLOCK ( block-weight-array , nblock )
  or     *
   
dist-onto-clause is      ONTO dist-target
   
dist-target is      processors-name [ ( processors-section-subscript-list ) ]
   
processors-section-subscript is      [ subscript ] : [ subscript ]
   
subscript is      int-expr
nblock is      int-expr
   
block-size-array is      array-name
block-weight-array is      array-name

Constraints:

The ONTO clause specifies the virtual processor arrangement or a section of it. If the ONTO clause is omitted, the array is distributed over the base virtual processor arrangement, which is a parameter of the program startup. When a REDISTRIBUTE directive without an ONTO clause is executed in an ON-block, the array is distributed over the processor arrangement section of that ON-block (see section 7).

Several arrays (A1, A2, …) can be distributed in the same way by a single directive of the form:

CDVM$	DISTRIBUTE dist-directive-stuff :: A1, A2, …

In that case the arrays must have the same rank, but may have different dimension sizes.

Let us consider the distribution formats for one dimension of an array (a one-dimensional array A(N)) and one dimension of the processor arrangement (a one-dimensional arrangement R(P)). Multidimensional distributions are considered in section 4.1.5.

4.1.1 BLOCK format

A block of [(N-1)/P] + 1 elements (where [ ] denotes integer division) is allocated on each processor. For some ratios of N and P several of the last processors may contain no array elements.

Example 4.1. Distribution by BLOCK format.

CDVM$	PROCESSORS R( 4 )
	REAL A(12), B(13), C(11)
CDVM$	DISTRIBUTE A (BLOCK) ONTO R
CDVM$	DISTRIBUTE (BLOCK) ONTO R :: B
CDVM$	DISTRIBUTE C (BLOCK)

The resulting element-to-processor mapping:

Processor	A		B		C
R(1)		1, 2, 3		1, 2, 3, 4	1, 2, 3
R(2)		4, 5, 6		5, 6, 7, 8	4, 5, 6
R(3)		7, 8, 9		9, 10, 11, 12	7, 8, 9
R(4)		10, 11, 12	13		10, 11

4.1.2 GEN_BLOCK format

Distribution by blocks of different sizes allows one to balance the processor load for algorithms that perform different amounts of computation on different parts of an array.

Let NB( 1:P ) be an integer array. The following directive

CDVM$	DISTRIBUTE  A( GEN_BLOCK(NB))  ONTO  R

splits array A into P blocks. Block i, of size NB( i ), is mapped onto processor R( i ).

Here the block sizes must satisfy NB(1) + NB(2) + … + NB(P) = N (in example 4.2: 2 + 4 + 4 + 2 = 12).

Example 4.2. Distribution by blocks of different size.

CDVM$	PROCESSORS R( 4 )
	INTEGER BS(4)
	REAL A(12)
CDVM$	DISTRIBUTE A ( GEN_BLOCK( BS ) ) ONTO R
	DATA BS / 2, 4, 4, 2 /

The resulting element-to-processor mapping:

Processor	A
R(1)		1, 2
R(2)		3, 4, 5, 6
R(3)		7, 8, 9, 10
R(4)		11, 12

4.1.3 WGT_BLOCK format

The WGT_BLOCK format specifies distribution by blocks according to their relative "weights".

Let the format WGT_BLOCK(WB, NBL) be specified.

WB(i) defines the weight of the i-th block for 1 ≤ i ≤ NBL. The blocks are distributed over P processors so that the sums of the block weights on the processors are balanced. The condition

P ≤ NBL

must be satisfied.

The weight of a processor is defined as the sum of the weights of all the blocks distributed onto it. The array dimension is distributed proportionally to the processor weights.

The BLOCK format is a special case of the WGT_BLOCK(WB,P) format with WB(i) = 1 for 1 ≤ i ≤ P and NBL = P.

The GEN_BLOCK format is a special case of the WGT_BLOCK format, up to the precision of the weights.
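
For instance, the BLOCK distribution of array A from example 4.1 can be expressed through WGT_BLOCK with unit weights (a sketch; WB1 is a placeholder name):

CDVM$	PROCESSORS R( 4 )
	DOUBLE PRECISION WB1(4)
	REAL A(12)
C	one block of weight 1 per processor: equivalent to BLOCK
CDVM$	DISTRIBUTE A ( WGT_BLOCK( WB1, 4 ) ) ONTO R
	DATA WB1 / 1., 1., 1., 1. /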

Example 4.2 can be rewritten using the WGT_BLOCK format in the following way.

Example 4.3. Distribution by blocks according to weights.

CDVM$ PROCESSORS R( 4 )
      DOUBLE PRECISION WB(12)
      REAL A(12)
CDVM$ DISTRIBUTE A ( WGT_BLOCK( WB, 12 ) ) ONTO R
      DATA WB / 2., 2., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2. /

In example 4.3, P = 4 and the distribution is identical to that of example 4.2.

As distinct from distribution by non-equal blocks, distribution in the WGT_BLOCK format can be performed for any number of processors in the range 1 ≤ P ≤ NBL. For this example the size of the processor arrangement R can vary from 1 to 12.
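
For instance, if the weights of example 4.3 (total weight 16) were distributed over two processors, the balanced split would assign elements 1:6 (weight 2+2+1+1+1+1 = 8) to the first processor and elements 7:12 (weight 8) to the second.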

4.1.4 Format of *

The “*” format means that a dimension is localized on each processor (a non-distributed, or local, dimension).

4.1.5 Multidimensional distributions

For multidimensional distributions a mapping format is specified for each dimension, and the following correspondence is established between the dimensions of the distributed array and of the processor arrangement.

Let the processor arrangement have n dimensions. Number the array dimensions whose format is not * from left to right as d1, ..., dk. Then dimension di is mapped onto the i-th dimension of the processor arrangement. The condition k ≤ n must be satisfied.

Example 4.4. One-dimensional distribution.

CDVM$	PROCESSORS R1( 2 )
	REAL A(100,100)
CDVM$	DISTRIBUTE A (BLOCK, *) ONTO R1

Block	Elements of A		Processor
1	A(  1: 50, 1:100)	R1(1)
2	A( 51:100, 1:100)	R1(2)

Example 4.5. Two-dimensional distribution.

CDVM$	PROCESSORS R2( 2, 2 )
	REAL A(100,100)
CDVM$	DISTRIBUTE A (BLOCK,BLOCK) ONTO R2

Block	Elements of A		Processor
1	A(  1: 50,  1: 50)	R2(1,1)
2	A(  1: 50, 51:100)	R2(1,2)
3	A( 51:100,  1: 50)	R2(2,1)
4	A( 51:100, 51:100)	R2(2,2)

4.2 Distribution of dynamic arrays

4.2.1 Dynamic arrays in Fortran 77 program

Fortran 77 has no features for dealing with dynamic arrays, so a user is forced to simulate dynamic memory using so-called working arrays: the dynamic memory is declared as a one-dimensional array of large size, and dynamic arrays of different shapes are defined as contiguous segments of the working array.

Example 4.6. Working array usage.

	REAL  W(100000)
	READ (5 , *)   N,  M
C	the arrays of size N*N and M*M are required in the program
	CALL SUB1(W(1), N, W(1+N*N), M)
	END

	SUBROUTINE SUB1(A, N, B, M)
	DIMENSION  A(N,N),  B(M,M)
	.   .   .
	END

An analysis of existing programs showed that there is no fixed discipline for dealing with simulated dynamic arrays. In particular, the allocation of an array in memory is not fixed explicitly, and a dynamic array is accessed by referencing the working array. Therefore the compiler cannot determine the array shape.

4.2.2 Dynamic arrays in FDVM model. POINTER directive

The suggested model is a subset of the Fortran 90 dynamic array model. It allows this model to be executed without modifications in three program environments:

For dynamic arrays distributed by default, FDVM allows any method of dynamic memory simulation. For dynamic arrays distributed by DISTRIBUTE and ALIGN directives, FDVM defines the following discipline of allocation and usage.

* All explicitly distributed arrays are allocated in a dynamic memory pool with the name HEAP:

REAL HEAP (MAXM)

where MAXM is the number of words of dynamic memory.

* The data type and rank of a dynamic array are fixed by the following FDVM directive:

pointer-directive is      type , POINTER ( dimension-list ) :: pointer-name-list
   
dimension is      :
   
pointer-name is      scalar-int-variable-name
  or     int-array-name

The type specification defines the data type of the dynamic array. The following constraints exist for variables specified in a POINTER directive.

* The sizes of each dimension and the allocation of the dynamic array in HEAP are fixed by the following statement:

pointer = ALLOCATE ( sdim ,... )

where

pointer - an integer variable (scalar or array element) with the POINTER attribute;
sdim - an integer one-dimensional array of size ndim, where ndim is the rank of the multidimensional array allocated in the dynamic memory pool HEAP. The value sdim( i ) defines the size of the i-th dimension. The size of the allocated segment is equal to sdim( 1 ) × sdim( 2 ) × ... × sdim( ndim ).

The integer function ALLOCATE returns the index of the first element of the allocated segment of the dynamic memory pool HEAP. The ALLOCATE function is programmed by the user, so it can have additional arguments besides the obligatory sdim.

* In the procedure where a dynamic array is allocated, only the following kind of reference to the array is allowed:

HEAP(pointer)

Moreover, this reference can be used only as an actual argument of a function or subroutine call.
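
A minimal sketch of this constraint (the names SUB and PA are placeholders; PA is assumed to hold a value returned by ALLOCATE):

C	allowed: HEAP(PA) appears only as an actual argument
	CALL SUB( HEAP(PA), N )
C	not allowed in this procedure: any other form of
C	reference, for example an expression HEAP(PA+5)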

Let several pools with identifiers ID1, …, IDn be used in a program. There is no need to rewrite the program to use the single pool HEAP; it is enough to write the following specification:

CDVM$ HEAP ID1 , …, IDn

However, every pool IDi must satisfy the above requirements for the HEAP pool. A pool can hold only arrays distributed by DISTRIBUTE and ALIGN directives.
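
For illustration, a hedged sketch with two pools (the names POOL1 and POOL2 are placeholders); each pool is declared as a one-dimensional array and obeys the same rules as HEAP:

	REAL POOL1(50000)
	REAL POOL2(50000)
CDVM$	HEAP POOL1, POOL2
C	POOL1 and POOL2 may hold only arrays distributed
C	by DISTRIBUTE and ALIGN directives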

4.2.3 DISTRIBUTE and REDISTRIBUTE directives for dynamic arrays

Distribution of dynamic arrays is performed by DISTRIBUTE and REDISTRIBUTE directives. The syntax of directives is extended in the following way:

distributee is      . . .
  or     pointer-name

If a scalar or an array with the POINTER attribute is specified as the distributee, the distribution is postponed until execution of the ALLOCATE function that assigns a value to the pointer. At that moment the array is created and distributed according to the formats of the DISTRIBUTE directive.

The REDISTRIBUTE directive for a dynamic array can be executed only after execution of the ALLOCATE function that assigns a value to the corresponding POINTER variable.

If the pointer of a dynamic array is an element of a pointer array, the dynamic array can be distributed only by the REDISTRIBUTE directive. As only a reference to a pointer name is allowed in the REDISTRIBUTE directive, the element of the pointer array must first be assigned to a scalar pointer variable. The array with pointer PT(I) can be distributed by the following statement sequence:

	P1 = PT( I )
CDVM$	REDISTRIBUTE P1( BLOCK, BLOCK )	

The program from example 4.6 must be transformed into the following FDVM program.

Example 4.7. Mapping of FDVM dynamic arrays.

	REAL  HEAP(100000)
	INTEGER  ALLOCATE
CDVM$	REAL, POINTER (:,:) :: PA, PB
	INTEGER  PA,  PB
C	descriptors of dynamic arrays
	INTEGER  DESCA(2),  DESCB(2)
CDVM$	DISTRIBUTE (BLOCK,BLOCK) :: PA, PB
C	arrays of size N*N and M*M are required in the program
	READ (5 , *)  N,  M
C	shape of the first array
	DESCA(1) = N
	DESCA(2) = N
C	allocation and distribution of the first array
	PA = ALLOCATE(DESCA,1)
C	shape of the second array
	DESCB(1) = M
	DESCB(2) = M
C	allocation and distribution of the second array
	PB = ALLOCATE(DESCB, N*N+1)
	CALL SUB1(HEAP(PA), N, HEAP(PB), M)
	END

	SUBROUTINE SUB1(A, N, B, M)
	DIMENSION  A(N,N),  B(M,M)
CDVM$	DISTRIBUTE (BLOCK,BLOCK) ::  A, B
	.   .   .
	END

	INTEGER FUNCTION  ALLOCATE(DESC, P)
	INTEGER  DESC(2), P
	ALLOCATE = P
	END

Other examples of dynamic array distribution can be found in section 7.7.

4.3 Distributing by aligning

Aligning an array A with a distributed array B brings into correspondence with each element of A an element or a section of B. When B is distributed, A is distributed simultaneously: if an element of B is mapped onto a processor, the element of A corresponding to it via the alignment is mapped onto the same processor.

Mapping via alignment serves the following two purposes.

  1. Identical distribution of arrays of the same shape onto one processor arrangement does not always guarantee that corresponding elements are allocated on the same processor. This forces one to specify remote access (see section 6) where none may actually exist. Only alignment of corresponding elements of the arrays guarantees their allocation on the same processor.
  2. Several arrays can be aligned with the same array. Then redistributing one array by the REDISTRIBUTE directive causes the corresponding redistribution of the whole group (see the sketch below).

4.3.1 ALIGN and REALIGN directives

The following directives describe array aligning:

align-directive is      align-action alignee align-directive-stuff
  or     align-action [ align-directive-stuff ] :: alignee-list
align-action is      ALIGN
  or     REALIGN
   
align-directive-stuff is      ( align-source-list ) align-with-clause
   
alignee is      array-name
   
align-source is      *
  or     align-dummy
   
align-dummy is      scalar-int-variable
   
align-with-clause is      WITH align-spec
   
align-spec is      align-target ( align-subscript-list )
   
align-target is      array-name
  or     template-name
   
align-subscript is      int-expr
  or     align-dummy-use
  or     *
   
align-dummy-use is      [ primary-expr * ] align-dummy
          [ add-op primary-expr ]
   
primary-expr is      int-constant
  or     int-variable
  or     ( int-expr )
   
add-op is      +
  or     -

Constraints:

Let the alignment of two arrays be specified by the directive

CDVM$	ALIGN  A(d1,…,dn)  WITH  B(ard1,…,ardm)

where di is the specification of the i-th dimension of the aligned array A,
and ardj is the specification of the j-th dimension of the base array B.

If di is specified by an integer variable I, there must be at most one dimension of array B specified by a linear function ardj = a*I + b.

Let the i-th dimension of array A have bounds LAi : HAi, and let the j-th dimension of array B, specified by the linear function a*I + b, have bounds LBj : HBj. Since the parameter I is defined on the value set LAi : HAi, the following conditions must be satisfied:

a*LAi + b ≥ LBj ,   a*HAi + b ≤ HBj

If di is *, the i-th dimension of array A is local on each processor, independently of the distribution of array B (this is an analogue of the local dimension in the DISTRIBUTE directive).

If ardj is *, array A is replicated along the j-th dimension of array B.

If ardj is int-expr, array A is aligned with a section of array B.

Example 4.8. Aligning arrays

	REAL A(10),B(10,10),C(20,20),D(20),E(10),F(10),G(20),H(10,10)
CDVM$	DISTRIBUTE  B (BLOCK,BLOCK)
CDVM$	DISTRIBUTE  D (BLOCK)
C	aligning with an array section
C	(vector A is aligned with the first row of B)
CDVM$	ALIGN  A( I )  WITH  B(1,I)
C	replication of the vector, aligning it with each row of B
CDVM$	ALIGN  F( I )  WITH  B(*,I)
C	the matrix is collapsed;
C	each column of C corresponds to one element of D
CDVM$	ALIGN  C(*,I)  WITH  D( I )
C	alignment of a vector with a vector using stretching
CDVM$	ALIGN  E( I )  WITH  D(2*I)
C	alignment of a vector with a vector using reverse ordering
CDVM$	ALIGN  G( I )  WITH  D(-I+21)
C	alignment of a matrix with a matrix using rotation and stretching
CDVM$	ALIGN  H(I,J)  WITH  C(2*J,2*I)

Several arrays (A1, A2, …) can be aligned with the same array B in the same manner by a directive of the form:

CDVM$	ALIGN  (d1,…,dn)  WITH  B(ard1,…,ardm) :: A1, A2, …

In this case the arrays A1, A2, … must have the same rank (n), but may have different dimension sizes.

Let a sequence of alignments A f1 B f2 C be specified, where f2 is the alignment of array B with array C, and f1 is the alignment of array A with array B. The arrays A and B are both considered aligned with array C: B is aligned by the function f2 directly, and A is aligned by the composite function f1(f2) indirectly. Therefore applying a REALIGN directive to array B does not cause redistribution of array A.

In general, a set of ALIGN specifications forms a set of trees, where every tree root must be distributed by DISTRIBUTE or REDISTRIBUTE directives. When a REDISTRIBUTE directive is executed, the whole alignment tree is redistributed.
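
A small sketch of one alignment tree (placeholder names): D is the root and is distributed; B is aligned with D directly, and A indirectly through B.

	REAL A(100), B(100), D(100)
CDVM$	DYNAMIC D
CDVM$	DISTRIBUTE D (BLOCK)
CDVM$	ALIGN B(I) WITH D(I)
CDVM$	ALIGN A(I) WITH B(I)
C	. . .
C	redistributing the root redistributes the whole tree D, B, A
CDVM$	REDISTRIBUTE D (BLOCK)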

4.3.2 TEMPLATE directive

If the values of the linear function a*I + b exceed the bounds of the base array dimension, it is necessary to define a dummy array, referred to as an alignment template, using the following directive.

template-directive is     TEMPLATE template-decl-list
   
template-decl is      template-name [ ( explicit-shape-spec-list ) ]

Then both arrays must be aligned with the template, and the template itself is distributed by DISTRIBUTE and REDISTRIBUTE directives. The template elements are mapped onto the processors without real memory allocation; they only specify the processor onto which the elements of the aligned arrays must be mapped.

Consider the following example.

Example 4.9. Aligning with template.

	REAL  A(100), B(100), C(100)
CDVM$	TEMPLATE  TABC(102)
CDVM$	ALIGN  B( I )  WITH  TABC( I ) 
CDVM$	ALIGN  A( I )  WITH  TABC( I+1 ) 
CDVM$	ALIGN  C( I )  WITH  TABC( I+2 ) 
CDVM$	DISTRIBUTE  TABC ( BLOCK )
	.   .   .
	DO 10  I = 2, 98
	   A(I) = C(I-1) + B(I+1)
10	CONTINUE

To avoid exchanges between processors, the elements A(I), C(I-1) and B(I+1) must be allocated on the same processor. It is impossible to align arrays C and B directly with array A, because the alignment functions I-1 and I+1 would violate the bounds of array A. Therefore the template TABC is declared, and the elements of arrays A, B and C that must be allocated on the same processor are aligned with the same element of the template.

4.3.3 Aligning dynamic arrays

To specify the alignment of dynamic arrays, the syntax of the ALIGN and REALIGN directives is extended in the following way.

alignee is      . . .
  or     pointer-name
align-target is      . . .
  or     pointer-name

If a variable with the POINTER attribute is specified as the aligned array (alignee) in an ALIGN directive, execution of the directive is postponed until execution of the ALLOCATE function that defines the variable's value. A REALIGN directive may be executed only after execution of the ALLOCATE function.

Example 4.10. Aligning dynamic arrays.

	REAL  HEAP(100000)
CDVM$	REAL, POINTER (:,:) :: PX, PY
	INTEGER  PX, PY, DESC(2)
CDVM$	ALIGN  PY(I,J )  WITH  PX(I,J) 
CDVM$	DISTRIBUTE  PX (BLOCK,BLOCK)
		.   .   .
	PX = ALLOCATE(DESC, ...)
	PY = ALLOCATE(DESC, ...)
		.   .   .
CDVM$	REDISTRIBUTE  PX (BLOCK,*)

Let a sequence of alignments be specified by ALIGN directives:

P1 f1 P2 f2 . . . fN-1 PN

where fi is an aligning function and Pi is a pointer to a dynamic array.

Then the dynamic arrays must be allocated (the ALLOCATE function executed) in the reverse order, i.e.:

PN = ALLOCATE(...)
. . .
P2 = ALLOCATE(...)
P1 = ALLOCATE(...)

If the pointer of a dynamic array is an element of a pointer array, the dynamic array can be aligned only by the REALIGN directive. As only a reference to a pointer name is allowed in the REALIGN directive, the element of the pointer array must first be assigned to a scalar pointer variable. The array with pointer PT(I) can be aligned with the array with pointer PT(J) by the following statement sequence:

	P1 = PT(I)
	P2 = PT(J)
CDVM$	REALIGN  P1(I,J)  WITH  P2(I+1,J)	

4.4 DYNAMIC and NEW_VALUE directives

Arrays redistributed by REDISTRIBUTE and REALIGN directives must be specified in the DYNAMIC directive.

dynamic-directive is      DYNAMIC alignee-or-distributee-list
   
alignee-or-distributee is      alignee
  or     distributee

If new values are assigned to the arrays after execution of a REDISTRIBUTE or REALIGN directive, the additional (optimizing) directive NEW_VALUE should immediately precede these directives.

new-value-directive is      NEW_VALUE

The directive cancels the transfer of the old array values during redistribution.
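
A minimal sketch of this use (A is a placeholder name): A is completely reassigned after the redistribution, so NEW_VALUE suppresses the transfer of its old values.

	REAL A(100,100)
CDVM$	DISTRIBUTE A (BLOCK,BLOCK)
CDVM$	DYNAMIC A
C	. . .
CDVM$	NEW_VALUE
CDVM$	REDISTRIBUTE A (BLOCK,*)
C	all elements of A are assigned new values after this point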

If an array is specified in a DYNAMIC directive and has no DISTRIBUTE or ALIGN specification, its distribution is postponed until the first REDISTRIBUTE or REALIGN statement. This is necessary in two cases.

4.5 Default distribution

If data are not specified in a DISTRIBUTE or ALIGN directive, they are allocated on each processor (full replication). The same distribution can be defined by a DISTRIBUTE directive with the * format for each dimension, but in that case access to the data will be less efficient.
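
For illustration (placeholder names): W is replicated by default, while V is replicated explicitly with the * format; the mapping is the same, but access to V is generally less efficient.

	REAL W(100)
C	W is not mentioned in any DISTRIBUTE or ALIGN
C	directive: one full copy on each processor
	REAL V(100,100)
CDVM$	DISTRIBUTE V (*,*)
C	the same full replication, but less efficient access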

