Lib-DVM interface description | Part 4 (12-13)
created: february, 2001 | - last edited 03.05.01 - |
12 Renewing shadow edges of distributed array
Let the local part of the distributed array be represented as the aggregate of its elements, defined as a set of index tuples:
$$\{ I_1 \in M_1 : I_{1,init} \le I_1 \le I_{1,last} \} \times \dots \times \{ I_m \in M_m : I_{m,init} \le I_m \le I_{m,last} \} \times \dots \times \{ I_n \in M_n : I_{n,init} \le I_n \le I_{n,last} \},$$
where:
$\times$ - symbol of Cartesian product;
$n$ - rank of the array;
$I_m$ - index variable of the m-th dimension ($1 \le m \le n$);
$I_{m,init}$ - the initial value of the index variable of the m-th dimension;
$I_{m,last}$ - the last value of the index variable of the m-th dimension;
$M_m$ - the range of values of the index variable of the m-th dimension.
Suppose, for simplicity, that the local part lies entirely inside the array. Then the low shadow edge of the k-th dimension of the local part of the distributed array is the set of its elements defined by the following set of index tuples:
$$
\begin{aligned}
LSB_k = {} & \{ I_1 \in M_1 : I_{1,init} - FS \cdot LW_1 \le I_1 \le I_{1,last} + FS \cdot HW_1 \} \times \dots \\
& \times \{ I_{k-1} \in M_{k-1} : I_{k-1,init} - FS \cdot LW_{k-1} \le I_{k-1} \le I_{k-1,last} + FS \cdot HW_{k-1} \} \\
& \times \{ I_k \in M_{low,k} : I_{k,init} - LW_k \le I_k \le I_{k,init} - 1 \} \\
& \times \{ I_{k+1} \in M_{k+1} : I_{k+1,init} - FS \cdot LW_{k+1} \le I_{k+1} \le I_{k+1,last} + FS \cdot HW_{k+1} \} \times \dots \\
& \times \{ I_n \in M_n : I_{n,init} - FS \cdot LW_n \le I_n \le I_{n,last} + FS \cdot HW_n \}
\end{aligned}
$$
Here:
$LW_i$ - width of the low part of the shadow edge of the i-th dimension;
$HW_i$ - width of the high part of the shadow edge of the i-th dimension (parameters LowShdWidthArray and HiShdWidthArray of the functions crtda_, section 6, and inssh_, section 12.2);
$FS$ - flag of full edge (parameter *FullShdSignPtr of the function inssh_, section 12.2).
Similarly, the high shadow edge of the k-th dimension of the local part of the distributed array is defined by the set of index tuples:
$$
\begin{aligned}
HSB_k = {} & \{ I_1 \in M_1 : I_{1,init} - FS \cdot LW_1 \le I_1 \le I_{1,last} + FS \cdot HW_1 \} \times \dots \\
& \times \{ I_{k-1} \in M_{k-1} : I_{k-1,init} - FS \cdot LW_{k-1} \le I_{k-1} \le I_{k-1,last} + FS \cdot HW_{k-1} \} \\
& \times \{ I_k \in M_{high,k} : I_{k,last} + 1 \le I_k \le I_{k,last} + HW_k \} \\
& \times \{ I_{k+1} \in M_{k+1} : I_{k+1,init} - FS \cdot LW_{k+1} \le I_{k+1} \le I_{k+1,last} + FS \cdot HW_{k+1} \} \times \dots \\
& \times \{ I_n \in M_n : I_{n,init} - FS \cdot LW_n \le I_n \le I_{n,last} + FS \cdot HW_n \}
\end{aligned}
$$
The low (high) shadow edge of the k-th dimension is called full if FS = 1, and is called the low (high) shadow bound if FS = 0. The union of the full shadow edges of all dimensions is called the full shadow edge of the local part of the distributed array:
$$
\begin{aligned}
FSB = {} & \bigcup_{k=1}^{n} \left( LSB_{k,FS=1} \cup HSB_{k,FS=1} \right) = \\
& \bigcup_{k=1}^{n} \Big( \{ I_1 \in M_1 : I_{1,init} - LW_1 \le I_1 \le I_{1,last} + HW_1 \} \times \dots \\
& \times \{ I_{k-1} \in M_{k-1} : I_{k-1,init} - LW_{k-1} \le I_{k-1} \le I_{k-1,last} + HW_{k-1} \} \\
& \times \{ I_k \in M_k : I_{k,init} - LW_k \le I_k \le I_{k,init} - 1 \ \text{ or } \ I_{k,last} + 1 \le I_k \le I_{k,last} + HW_k \} \\
& \times \{ I_{k+1} \in M_{k+1} : I_{k+1,init} - LW_{k+1} \le I_{k+1} \le I_{k+1,last} + HW_{k+1} \} \times \dots \\
& \times \{ I_n \in M_n : I_{n,init} - LW_n \le I_n \le I_{n,last} + HW_n \} \Big)
\end{aligned}
$$
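The index ranges that make up a low or high shadow edge can be computed dimension by dimension. The following is a minimal sketch in C, not part of Lib-DVM: the names Range and shadow_range are hypothetical, and the low edge of the k-th dimension is assumed to end at I_{k,init} - 1, which is the range used in the definition of the full shadow edge above.

```c
#include <assert.h>

/* Closed index interval [lo, hi] along one dimension (illustrative type). */
typedef struct { long lo, hi; } Range;

/* Index range of dimension i inside the low (is_high == 0) or high
   (is_high != 0) shadow edge of dimension k.
   init, last - local bounds of dimension i;
   lw, hw     - low and high shadow widths of dimension i;
   fs         - full-edge flag FS (0 or 1). */
Range shadow_range(int i, int k, int is_high,
                   long init, long last, long lw, long hw, long fs)
{
    Range r;
    assert(fs == 0 || fs == 1);
    if (i != k) {            /* other dimensions: local range, widened when FS = 1 */
        r.lo = init - fs * lw;
        r.hi = last + fs * hw;
    } else if (is_high) {    /* dimension k of the high edge */
        r.lo = last + 1;
        r.hi = last + hw;
    } else {                 /* dimension k of the low edge */
        r.lo = init - lw;
        r.hi = init - 1;
    }
    return r;
}
```

For a dimension with local bounds 10..19, LW = 2 and HW = 3, the sketch yields 8..9 for the low edge and 20..22 for the high edge of that dimension; for any other dimension it yields 10..19 when FS = 0 and 8..22 when FS = 1.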
Elements of the shadow edge of the local part of the distributed array are called shadow edge elements (or imported elements). These elements are allocated in memory together with the local part. On the other hand, each edge element belongs to some local part of the distributed array, and so it is also allocated in the memory of the processor onto which that local part is mapped. For this processor the element is called the element-original (or exported element).
Renewing the shadow edges of the distributed array local parts is an asynchronous operation (that is, an operation executed in parallel with computations). During this operation each element-original (possibly updated) is copied to the corresponding shadow edge element (possibly obsolete). For optimization purposes, the shadow edges of distributed arrays are combined into a group, and shadow edge exchange is implemented as a set of operations executed over the specified group of shadow edges.
12.1 Creating shadow edge group
ShadowGroupRef crtshg_(long *StaticSignPtr);
*StaticSignPtr - the flag of the static shadow edge group creation.
The function crtshg_ creates an empty shadow edge group (that is, a group that does not contain any shadow edge). The function returns a reference to the created group.
If the flag *StaticSignPtr is not equal to zero, the created group is not deleted automatically when control exits from the current program block (see section 8). Such a shadow edge group has to be deleted explicitly using the function delshg_.
12.2 Including shadow edge in the group
long inssh_ (ShadowGroupRef *ShadowGroupRefPtr,
             long ArrayHeader[],
             long LowShdWidthArray[],
             long HiShdWidthArray[],
             long *FullShdSignPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
ArrayHeader - the header of the distributed array.
LowShdWidthArray - array whose i-th element is the width of the low shadow edge of the (i+1)-th dimension of the array.
HiShdWidthArray - array whose i-th element is the width of the high shadow edge of the (i+1)-th dimension of the array.
*FullShdSignPtr - flag of full shadow edge renewing (if it is equal to one).
Including a distributed array shadow edge in the group means only registering this shadow edge as a member of the shadow edge group. Run-Time System does not store the values of this shadow edge in the system renewing buffer.
The shadow edge group specified in the function call must have been created in the current subtask. Before being included in the shadow edge group, the distributed array must be mapped onto a processor system (by the align_, realn_, malign_ or mrealn_ function), every element of which must belong to the current processor system.
A distributed array can be included in several shadow edge groups and can also be re-included in the same group. In the latter case the shadow edge widths specified in the inssh_ call must be equal to the shadow edge widths specified in the previous inclusion in this group. An array cannot be included in a group while the array is in the state of shadow edge renewing, or while the group is in the state of receiving imported or local elements or sending exported or boundary elements. New elements can be included in such a group only after these operations have completed (see the descriptions of the functions strtsh_, recvsh_, sendsh_, recvla_, sendsa_ and waitsh_ in sections 12.3-12.8).
The widths of the array shadow edges defined by the LowShdWidthArray and HiShdWidthArray parameters must not exceed the shadow edge widths defined in the function crtda_ when the array was created. If a shadow edge width is equal to -1, the width defined in the function crtda_ is used.
If *FullShdSignPtr = 0, then only the low and high shadow bounds of all dimensions of the distributed array participate in shadow edge exchange operations. If *FullShdSignPtr = 1, then the full shadow edge of the distributed array participates.
The function returns zero.
The exchange of a shadow edge that is the union of shadow bounds requires communication of the current processor with 2*n "neighbors" (n is the rank of the distributed array, 2*n is the number of shadow bounds), while the exchange of full shadow edges requires communication with 3^n - 1 "neighbors". For the case when the union of shadow bounds does not cover the needs of the task and the full shadow edge exchange is unacceptable because of its overhead, Run-Time System makes it possible to choose a sufficient and optimal scheme of shadow edge exchange. The scheme is based on a representation of the shadow edge as a union of elementary shadow n-dimensional parallelepipeds.
Let Q = (q_1, ..., q_k, ..., q_n) be an n-combination with repetition of the elements of the set {0,1,2} (the combination (0, ..., 0) is not considered). Let the n-dimensional parallelepiped $P_Q = M_1 \times \dots \times M_k \times \dots \times M_n$ correspond to it, where:
$$
M_k = \begin{cases}
M_{0,k} = \{ I_k \in M_{0,k} : I_{k,init} \le I_k \le I_{k,last} \} & \text{if } q_k = 0, \\
M_{1,k} = \{ I_k \in M_{1,k} : I_{k,init} - LW_k \le I_k \le I_{k,init} - 1 \} & \text{if } q_k = 1, \\
M_{2,k} = \{ I_k \in M_{2,k} : I_{k,last} + 1 \le I_k \le I_{k,last} + HW_k \} & \text{if } q_k = 2.
\end{cases}
$$
Here:
$I_k$ - index variable of the k-th dimension of the distributed array ($1 \le k \le n$);
$I_{k,init}$ - local initial value of the index variable of the k-th dimension;
$I_{k,last}$ - local last value of the index variable of the k-th dimension;
$M_k$ - value set of the index variable of the k-th dimension of the parallelepiped $P_Q$ ($1 \le k \le n$);
$LW_k$ - width of the low part of the edge of the k-th dimension of the distributed array;
$HW_k$ - width of the high part of the edge of the k-th dimension of the distributed array.
The parallelepiped $P_Q$ is called an elementary shadow n-dimensional parallelepiped.
The full shadow edge of the distributed array is the union of all elementary shadow parallelepipeds (the parallelepiped corresponding to the combination (0, ..., 0) is excluded, since it is the local part of the distributed array). The number of elementary parallelepipeds forming the full shadow edge is $A_3^n - 1 = 3^n - 1$.
The low (high) shadow bound of the k-th dimension is the elementary shadow parallelepiped corresponding to the n-combination in which q_k = 1 (2) and all other q_i are equal to zero.
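The counts quoted above (2*n shadow bounds, 3^n - 1 parallelepipeds in the full shadow edge) can be checked by enumerating the combinations Q directly. The following is a minimal sketch in C, not part of Lib-DVM; the function name count_parallelepipeds and its parameters are hypothetical.

```c
#include <assert.h>

/* Counts the elementary shadow parallelepipeds of an n-dimensional array
   whose combination Q has between 1 and max_nonzero non-zero codes q_k.
   Every Q is encoded as a base-3 number whose digits are q_1..q_n. */
long count_parallelepipeds(int n, int max_nonzero)
{
    long total = 1, count = 0, code;
    int k;
    assert(n >= 1 && n <= 12);
    for (k = 0; k < n; k++)
        total *= 3;                          /* 3^n combinations in all */
    for (code = 1; code < total; code++) {   /* code 0 is (0,...,0): the local part */
        long c = code;
        int nonzero = 0;
        for (k = 0; k < n; k++) {
            if (c % 3 != 0)
                nonzero++;                   /* q_k is 1 or 2 */
            c /= 3;
        }
        if (nonzero <= max_nonzero)
            count++;
    }
    return count;
}
```

With max_nonzero = 1 only the shadow bounds are counted (6 for n = 3); with max_nonzero = n all parallelepipeds of the full shadow edge are counted (26 for n = 3).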
Run-Time System allows any set of elementary shadow parallelepipeds to be specified for shadow edge exchange by means of the function insshd_, which includes a shadow edge of a distributed array in the edge group and is described below.
long insshd_ (ShadowGroupRef *ShadowGroupRefPtr,
              long ArrayHeader[],
              long LowShdWidthArray[],
              long HiShdWidthArray[],
              long *MaxShdCountPtr,
              long ShdSignArray[]);
*ShadowGroupRefPtr - reference to the shadow edge group.
ArrayHeader - header of the distributed array.
LowShdWidthArray - array whose i-th element is the width of the low shadow edge of the (i+1)-th dimension of the array.
HiShdWidthArray - array whose i-th element is the width of the high shadow edge of the (i+1)-th dimension of the array.
*MaxShdCountPtr - the maximal number of dimensions whose code of participation in shadow edge forming may be greater than one in any elementary parallelepiped included in the shadow edge.
ShdSignArray - array whose i-th element contains the participation flag of the (i+1)-th dimension of the array in shadow edge forming.
The widths of the array shadow edges specified by the parameters LowShdWidthArray and HiShdWidthArray must not exceed the shadow edge widths specified when the array was created by the function crtda_. If -1 is given as a shadow edge width, the width specified when the array was created is used.
The participation flag of the (i+1)-th dimension in the edge forming, specified in the i-th element of the array ShdSignArray, can have the following values:
1 - the array shadow edge contains only those elementary parallelepipeds for which $M_i = M_{0,i}$;
2 - the shadow edge contains only those elementary parallelepipeds for which $M_i = M_{1,i}$;
3 - the shadow edge contains only those elementary parallelepipeds for which $M_i = M_{0,i}$ or $M_i = M_{1,i}$;
4 - the shadow edge contains only those elementary parallelepipeds for which $M_i = M_{2,i}$;
5 - the shadow edge contains only those elementary parallelepipeds for which $M_i = M_{0,i}$ or $M_i = M_{2,i}$;
6 - the shadow edge contains only those elementary parallelepipeds for which $M_i = M_{1,i}$ or $M_i = M_{2,i}$;
7 - the shadow edge may contain elementary parallelepipeds with any set $M_i$ ($M_{0,i}$, $M_{1,i}$ or $M_{2,i}$).
The parameter *MaxShdCountPtr (a positive number) restricts the shadow shell of the local part of the distributed array that participates in the shadow edge exchange: only those elementary parallelepipeds are included in the shell for which the number of dimensions extending beyond the array local part does not exceed *MaxShdCountPtr.
That is, elementary parallelepipeds in which the number of dimensions going outside the array local part is greater than *MaxShdCountPtr cannot be included in the shadow edge participating in the exchange.
The array ShdSignArray must contain at least one element greater than 1 (the shadow edge of a distributed array must not coincide with its local part).
The function returns zero.
Note. Including the shadow edges of a distributed array in the shadow edge group using the function inssh_ is equivalent to executing the function insshd_ with the parameter *MaxShdCountPtr = 1 and all elements of the array ShdSignArray equal to 7. To include the full shadow edge of the distributed array in the shadow edge group using the function insshd_, the parameter *MaxShdCountPtr has to be equal to the distributed array rank, and all elements of the array ShdSignArray also have to be equal to 7.
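The selection rule of insshd_ can be read as a bit mask per dimension: in the list of flag values, 1 admits the set M_0, 2 admits M_1 and 4 admits M_2, and the other values are their sums. The following is a minimal sketch in C of this reading, not the Run-Time System implementation; the names is_selected, count_selected and MAX_RANK are hypothetical, and the bit-mask interpretation is an assumption derived from the table of values 1-7.

```c
#include <assert.h>

#define MAX_RANK 8   /* illustrative limit, not a Lib-DVM constant */

/* Returns 1 if the elementary parallelepiped with codes q[0..n-1]
   (each 0, 1 or 2) is admitted by the flags shd_sign[0..n-1] (each 1..7)
   and by the limit max_shd_count, and 0 otherwise. */
int is_selected(int n, const int q[], const long shd_sign[], long max_shd_count)
{
    int k, nonzero = 0;
    for (k = 0; k < n; k++) {
        assert(q[k] >= 0 && q[k] <= 2);
        assert(shd_sign[k] >= 1 && shd_sign[k] <= 7);
        if (((shd_sign[k] >> q[k]) & 1) == 0)
            return 0;             /* the set M_{q,k} is not admitted */
        if (q[k] != 0)
            nonzero++;            /* dimension leaves the local part */
    }
    /* (0,...,0) is the local part itself and is never selected */
    return nonzero >= 1 && nonzero <= max_shd_count;
}

/* Counts the selected parallelepipeds by enumerating all 3^n codes. */
long count_selected(int n, const long shd_sign[], long max_shd_count)
{
    int q[MAX_RANK];
    long total = 1, code, count = 0;
    int k;
    assert(n >= 1 && n <= MAX_RANK);
    for (k = 0; k < n; k++)
        total *= 3;
    for (code = 0; code < total; code++) {
        long c = code;
        for (k = 0; k < n; k++) {
            q[k] = (int)(c % 3);
            c /= 3;
        }
        count += is_selected(n, q, shd_sign, max_shd_count);
    }
    return count;
}
```

Under this reading, a two-dimensional array with both flags equal to 7 gets 4 parallelepipeds for *MaxShdCountPtr = 1 (the shadow bounds) and 8 for *MaxShdCountPtr = 2 (the full shadow edge), which agrees with the note above.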
12.3 Starting shadow edge group renewing
long strtsh_ (ShadowGroupRef *ShadowGroupRefPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
The function strtsh_ initializes the system renewing buffer (if the shadow edge group is renewed for the first time) and starts the shadow edge renewing operation for all shadow edges registered by the function inssh_.
All elements of all processor systems onto which the arrays registered in the started group are mapped must belong to the current processor system. Renewing of the shadow edges of the specified group can be started only if the previous operations of shadow edge renewing of the group, local or imported element receiving and boundary or exported element sending have been completed by the waitsh_ function (see sections 12.4-12.8).
The function returns zero.
12.4 Initializing receiving imported elements of specified shadow edge group
long recvsh_(ShadowGroupRef *ShadowGroupRefPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
All elements of all processor systems onto which the arrays registered in the specified group are mapped must belong to the current processor system. Receiving imported elements of the group can be started only if the previous operations of shadow edge renewing of the group, imported element receiving and boundary element sending have been completed by the waitsh_ function (see sections 12.3, 12.7 and 12.8).
Receiving imported elements can be performed in parallel with sending exported elements or receiving local elements (see sections 12.5 and 12.6), provided that these operations were started (or will be started) by the current subtask.
Waiting for completion of receiving imported elements is performed by the function waitsh_ (see section 12.8).
The function returns zero.
12.5 Initializing sending of exported elements of specified shadow edges group
long sendsh_ (ShadowGroupRef *ShadowGroupRefPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
All elements of all processor systems onto which the arrays registered in the specified group are mapped must belong to the current processor system. Sending exported elements of the group can be started only if the previous operations of shadow edge renewing of the group, exported element sending and local element receiving have been completed by the waitsh_ function (see sections 12.3, 12.6 and 12.8).
Sending exported elements can be performed in parallel with receiving imported elements or sending boundary elements (see sections 12.4 and 12.7), provided that these operations were started (or will be started) by the current subtask.
Waiting for completion of sending exported elements is performed by the function waitsh_ (see section 12.8).
The function returns zero.
Note. The sequence of function calls
recvsh_(ShadowGroupRefPtr);
sendsh_(ShadowGroupRefPtr);
is equivalent to the call strtsh_(ShadowGroupRefPtr).
12.6 Initializing receiving of local elements of distributed arrays of specified shadow edge group
long recvla_(ShadowGroupRef *ShadowGroupRefPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
The function recvla_ starts receiving boundary elements from "adjacent" processors into the local parts of the distributed arrays of the given shadow edge group. That is, the function recvla_ differs from the function sendsh_ (considered in section 12.5) in the exchange direction (receiving instead of sending).
All elements of all processor systems onto which the arrays registered in the specified group are mapped must belong to the current processor system. Receiving local elements of the group can be started only if the previous operations of shadow edge renewing of the group, local element receiving and exported element sending have been completed by the waitsh_ function (see sections 12.3, 12.5 and 12.8).
Receiving local elements can be performed in parallel with sending boundary elements or receiving imported elements (see sections 12.7 and 12.4), provided that these operations were started (or will be started) by the current subtask.
Waiting for completion of receiving local elements is performed by the function waitsh_ (see section 12.8).
The function returns zero.
12.7 Initializing sending boundary elements of distributed arrays of specified shadow edge group
long sendsa_(ShadowGroupRef *ShadowGroupRefPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
The function sendsa_ starts sending boundary elements of the distributed arrays of the given shadow edge group to the corresponding local elements located on "adjacent" processors. That is, the function sendsa_ differs from the function recvsh_ (considered in section 12.4) in the exchange direction (sending instead of receiving).
All elements of all processor systems onto which the arrays registered in the specified group are mapped must belong to the current processor system. Sending boundary elements of the group can be started only if the previous operations of shadow edge renewing of the group, boundary element sending and imported element receiving have been completed by the waitsh_ function (see sections 12.3, 12.4 and 12.8).
Sending boundary elements can be performed in parallel with receiving local elements or sending exported elements (see sections 12.6 and 12.5), provided that these operations were started (or will be started) by the current subtask.
Waiting for completion of sending boundary elements is performed by the function waitsh_ (see section 12.8).
The function returns zero.
12.8 Waiting for completion of shadow edge group renewing
long waitsh_(ShadowGroupRef *ShadowGroupRefPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
The function waitsh_ completes the shadow edge renewing operation initialized by the strtsh_ function and allows a new shadow edge renewing of the group to be started. The function also waits for completion of local or imported element receiving and boundary or exported element sending started by the recvla_, recvsh_, sendsa_ or sendsh_ functions (see sections 12.4-12.7). After the waitsh_ function has been executed, the group becomes open again for including distributed arrays.
Waiting for completion of shadow edge renewing, as well as of local (recvla_) or imported (recvsh_) element receiving and boundary (sendsa_) or exported (sendsh_) element sending, can be performed only by the task that started the corresponding operation.
The function returns zero.
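The start conditions stated in sections 12.3-12.7 can be summarized as a small table of operations that must already be completed before a given operation may start. The following is a minimal sketch in C of one reading of those conditions; it models only the bookkeeping and does not call Lib-DVM. All names (OP_RENEW, blockers, try_start, wait_all) are hypothetical.

```c
#include <assert.h>

/* Pending-operation bits of one shadow edge group (illustrative names). */
enum {
    OP_RENEW    = 1 << 0,  /* strtsh_: shadow edge renewing         */
    OP_RECV_IMP = 1 << 1,  /* recvsh_: receiving imported elements  */
    OP_SEND_EXP = 1 << 2,  /* sendsh_: sending exported elements    */
    OP_RECV_LOC = 1 << 3,  /* recvla_: receiving local elements     */
    OP_SEND_BND = 1 << 4   /* sendsa_: sending boundary elements    */
};

/* Operations that must not be pending when op is started;
   returns 0 for an unknown operation code. */
unsigned blockers(unsigned op)
{
    switch (op) {
    case OP_RENEW:
        return OP_RENEW | OP_RECV_IMP | OP_SEND_EXP | OP_RECV_LOC | OP_SEND_BND;
    case OP_RECV_IMP: return OP_RENEW | OP_RECV_IMP | OP_SEND_BND;
    case OP_SEND_EXP: return OP_RENEW | OP_SEND_EXP | OP_RECV_LOC;
    case OP_RECV_LOC: return OP_RENEW | OP_RECV_LOC | OP_SEND_EXP;
    case OP_SEND_BND: return OP_RENEW | OP_SEND_BND | OP_RECV_IMP;
    default:          return 0;
    }
}

/* Records op as pending and returns 1 if it may start, else returns 0. */
int try_start(unsigned *pending, unsigned op)
{
    unsigned b = blockers(op);
    if (b == 0 || (*pending & b) != 0)
        return 0;
    *pending |= op;
    return 1;
}

/* Models waitsh_: everything started on the group is completed. */
void wait_all(unsigned *pending)
{
    *pending = 0;
}
```

In this model recvsh_ followed by sendsh_ is accepted, a subsequent sendsa_ or strtsh_ is rejected until wait_all is called, and after wait_all a new renewing may start.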
12.9 Deleting shadow edge group
long delshg_(ShadowGroupRef *ShadowGroupRefPtr);
*ShadowGroupRefPtr - reference to the shadow edge group.
The function delshg_ deletes the shadow edge group created by the function crtshg_. After the group has been deleted, the reference to it can be used by the user program for any purpose.
The shadow edge group can be deleted by the delshg_ function only if it was created in the current subtask and in the current program block (or its sub-block) (see sections 8 and 10). The shadow edge group cannot be deleted if previously started operations of shadow edge renewing, boundary or exported element sending and local or imported element receiving have not been completed (see sections 12.3-12.7).
The function delobj_ can also be used to delete a shadow edge group (see section 17.5).
The function returns zero.
13 Access to distributed array elements
13.1 Copying distributed array element
13.1.1 Reading distributed array element and assigning value to element
long rwelm_(long FromArrayHeader[],
            long ToArrayHeader[],
            long IndexArray[]);
FromArrayHeader - the header of the source distributed array or the pointer to the source memory area.
ToArrayHeader - the header of the distributed array that contains the target element, or the pointer to the target memory area.
IndexArray - array whose i-th element is the index of the source or target element on the (i+1)-th dimension.
If the function rwelm_ is used to read an element of the distributed array, then:
FromArrayHeader - the header of the distributed array containing the source element;
ToArrayHeader - pointer to the target memory area;
IndexArray - array of the indexes of the source element.
If the function rwelm_ is used to assign a value to an element of the distributed array, then:
FromArrayHeader - pointer to the source memory area;
ToArrayHeader - the header of the distributed array containing the target element;
IndexArray - array of the indexes of the target element.
Reading is executed on all processors. Writing (modification of an element of the distributed array) is executed only on the processors where the element is allocated.
The number of indexes in the array IndexArray has to be equal to the rank of the source or target array.
The distributed array specified in the function call (read or written) must be mapped onto a processor system, every element of which must belong to the current processor system.
The function returns the number of bytes actually read or written (that is, the element size of the source or target array).
Note. To avoid Fortran compiler warnings when the function rwelm_ is called with variables of different types as the target of the distributed array element, Run-Time System provides the function
long rwelmf_ (long FromArrayHeader[],
              AddrType *ToArrayHeaderPtr,
              long IndexArray[]);
which differs from the function rwelm_ in the second parameter:
*ToArrayHeaderPtr - pointer, cast to type AddrType by one of the functions from section 17.7, to the memory area into which the distributed array element will be written.
The other parameters of the function rwelmf_ are the same as the corresponding parameters of the function rwelm_.
13.1.2 Copying one element of distributed array to another
long copelm_ (long FromArrayHeader[],
              long FromIndexArray[],
              long ToArrayHeader[],
              long ToIndexArray[]);
FromArrayHeader - the header of the source distributed array.
FromIndexArray - array whose i-th element is the index of the source element on the (i+1)-th dimension.
ToArrayHeader - the header of the target distributed array.
ToIndexArray - array whose j-th element is the index of the target element on the (j+1)-th dimension.
The types of the source and target elements have to be the same.
Both the array that is read and the array that is written must be mapped onto processor systems, every element of which must belong to the current processor system.
The function returns the number of the copied bytes.
13.1.3 Unified copying of element of distributed array
long elmcpy_(long FromArrayHeader[],
             long FromIndexArray[],
             long ToArrayHeader[],
             long ToIndexArray[],
             long *CopyRegimPtr);
FromArrayHeader - the header of the source distributed array, or the pointer to the source memory area.
FromIndexArray - array whose i-th element is the index of the source element on the (i+1)-th dimension.
ToArrayHeader - the header of the target distributed array, or the pointer to the target memory area.
ToIndexArray - array whose j-th element is the index of the target element on the (j+1)-th dimension.
*CopyRegimPtr - the mode of copying.
The function elmcpy_ is a generalization of the more specialized functions rwelm_ and copelm_ discussed above.
If FromArrayHeader and ToArrayHeader are both headers of distributed arrays, then the types of the elements of these arrays have to be the same.
If FromArrayHeader (ToArrayHeader) is a pointer to a memory area, then the values of the array FromIndexArray (ToIndexArray) are ignored. In this case the copying is controlled by the *CopyRegimPtr flag. If *CopyRegimPtr is not equal to zero, then the memory is assumed to be allocated on the I/O processor only. FromArrayHeader and ToArrayHeader must not both be pointers to a memory area.
Both the array that is read and the array that is written must be mapped onto processor systems, every element of which must belong to the current processor system.
The function returns the number of the copied bytes.
13.2 Copying distributed arrays
long arrcpy_(long FromArrayHeader[],
             long FromInitIndexArray[],
             long FromLastIndexArray[],
             long FromStepArray[],
             long ToArrayHeader[],
             long ToInitIndexArray[],
             long ToLastIndexArray[],
             long ToStepArray[],
             long *CopyRegimPtr);
FromArrayHeader - the header of the source distributed array.
FromInitIndexArray - array whose i-th element is the initial index value of the (i+1)-th dimension of the source array.
FromLastIndexArray - array whose i-th element is the last index value of the (i+1)-th dimension of the source array.
FromStepArray - array whose i-th element is the index step of the (i+1)-th dimension of the source array.
ToArrayHeader - the header of the target distributed array.
ToInitIndexArray - array whose j-th element is the initial index value of the (j+1)-th dimension of the target array.
ToLastIndexArray - array whose j-th element is the last index value of the (j+1)-th dimension of the target array.
ToStepArray - array whose j-th element is the index step of the (j+1)-th dimension of the target array.
*CopyRegimPtr - the mode of copying.
The copying is executed until the source or target elements are exhausted. The elements are copied in the order in which the C language allocates array elements in memory, that is, the right index changes faster than the left one. If the initial index value of some dimension of the source or target array is greater than or equal to its last value, then the index of this dimension is not changed during the copy operation. Note that Run-Time System takes the last index value of any dimension to be the minimum of the value given in the function call and the real size of this dimension minus 1.
To make it possible to use the full extent of the source or target array without requesting the size of the object by some dimension (see section 17.2), Run-Time System allows the initial index value to be equal to -1. In that case the initial index value is taken to be zero, the step to be 1, and the last index value to be the size of the dimension minus 1.
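The index rules of arrcpy_ for a single dimension can be sketched as follows. This is a minimal illustration in C, not the Run-Time System code: the names IndexRange, normalize_range and range_count are hypothetical, the step is assumed to be positive, and the count formula is the usual one for a stepped range rather than something stated in this description.

```c
#include <assert.h>

/* Effective index range of one dimension (illustrative type). */
typedef struct { long init, last, step; } IndexRange;

/* Applies the rules described above to one dimension of size `size`:
   init == -1 selects the whole dimension (0 .. size-1 with step 1);
   otherwise the last index is limited to size - 1. */
IndexRange normalize_range(long init, long last, long step, long size)
{
    IndexRange r;
    assert(size > 0);
    if (init == -1) {
        r.init = 0;
        r.last = size - 1;
        r.step = 1;
        return r;
    }
    r.init = init;
    r.last = last < size - 1 ? last : size - 1;
    r.step = step;
    return r;
}

/* Number of index values visited along the dimension. */
long range_count(IndexRange r)
{
    if (r.init >= r.last)
        return 1;                 /* the index is not changed during copying */
    assert(r.step > 0);           /* assumption of this sketch */
    return (r.last - r.init) / r.step + 1;
}
```

For a dimension of size 10, the call normalize_range(2, 100, 3, 10) gives the range 2..9 with step 3, so the indexes 2, 5 and 8 are visited.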
One of the arrays (but not both) may be not a distributed array but an ordinary one replicated over all the processors (Run-Time System recognizes this case when the pointer to the array is not a pointer to a distributed array header). The types of the elements of the replicated and distributed arrays must be the same. The replicated array is assumed to be one-dimensional, with its elements allocated in memory consecutively and contiguously. Run-Time System ignores the index parameters of such an array.
If one of the arrays is not a distributed one, then the copy mode is determined by the value of the *CopyRegimPtr flag.
If both arrays are non-distributed ones, then the copying is not executed.
If both arrays are distributed ones, then they may differ in rank and in the size of each dimension, but the types of their elements have to be the same.
Both the array that is read and the array that is written must be mapped onto processor systems, every element of which must belong to the current processor system.
The function returns the number of the copied elements.
13.3 Asynchronous copying of distributed arrays
long arwelm_ (long FromArrayHeader[],
              long ToArrayHeader[],
              long IndexArray[],
              AddrType *CopyFlagPtr);
long arwelf_ (long FromArrayHeader[],
              AddrType *ToArrayHeaderPtr,
              long IndexArray[],
              AddrType *CopyFlagPtr);
long acopel_(long FromArrayHeader[],
             long FromIndexArray[],
             long ToArrayHeader[],
             long ToIndexArray[],
             AddrType *CopyFlagPtr);
long aelmcp_ (long FromArrayHeader[],
              long FromIndexArray[],
              long ToArrayHeader[],
              long ToIndexArray[],
              long *CopyRegimPtr,
              AddrType *CopyFlagPtr);
long aarrcp_ (long FromArrayHeader[],
              long FromInitIndexArray[],
              long FromLastIndexArray[],
              long FromStepArray[],
              long ToArrayHeader[],
              long ToInitIndexArray[],
              long ToLastIndexArray[],
              long ToStepArray[],
              long *CopyRegimPtr,
              AddrType *CopyFlagPtr);
The functions arwelm_, arwelf_, acopel_, aelmcp_ and aarrcp_ start the copying operations performed by the functions rwelm_, rwelmf_, copelm_, elmcpy_ and arrcpy_ respectively. All parameters of these functions except the last one are the same as the identically named parameters of their synchronous analogues. The last parameter, CopyFlagPtr, is the pointer to the completion flag of the started copying operation.
Waiting for completion of copying is performed by the function
long waitcp_ (AddrType *CopyFlagPtr);
*CopyFlagPtr - flag of copying operation completion, specified when the operation was started.
The function returns zero.
13.4 Access to elements of local part of distributed array
Let n be the rank of the distributed array. Then the header of this array in the function crtda_ call with zero value of *ExtHdrSignPtr parameter can be declared for example as:
long ArrayHeader[n+1]; /* standard header */
Let the base pointer corresponding to the type of the elements of the distributed array be BasePtr. Then the element (I_{1}, ... , I_{n}) from the local part of the distributed array can be accessed through linear index in the following manner:
For C the value of the base pointer in crtda_ function call may be NULL. In that case the element of the local part of the distributed array may be accessed in the following manner:
where Type is the type of the element of the distributed array.
Run-Time System calculates the coefficients ArrayHeader[1], ... ,ArrayHeader[n-1] and the address constant ArrayHeader[n] in the functions of the array alignment align_ and realn_ in the following manner.
Let:
I_{i,init} - the initial index value of the local part of the distributed array on the i-th dimension;
I_{i,last} - the last index value of the local part of the distributed array on the i-th dimension;
LW_{i} - the width of the low shadow edge of the distributed array on the i-th dimension;
HW_{i} - the width of the high shadow edge of the distributed array on the i-th dimension;
ArrayPtr - pointer to the local part of the array (that is, the part allocated in memory together with the shadow edges);
TypeSize - the size of an element of the distributed array (in bytes).
Then:
ArrayHeader[n] = ArrayHeader[n] + ((long)ArrayPtr - (long)BasePtr)/TypeSize, if BasePtr ≠ NULL;
ArrayHeader[n] = ArrayHeader[n]*TypeSize + (long)ArrayPtr, if BasePtr = NULL.
When crtda_ is called with a non-zero *ExtHdrSignPtr parameter, the distributed array header can be declared statically as
long ArrayHeader[2*n+2]; /* extended header */
The first (n+1) words of the extended header coincide with the standard header considered above. The words from (n+2) to (2*n+1) must be set by the user's program before the array is allocated in memory by the align_ (malign) function (it is assumed that these words contain the low values of the distributed array indexes used in Fortran). When mapping or remapping the array onto the processor system, the Run-Time System calculates the (n+1)-th word of the extended header as
Note that the Run-Time System allocates the local part of the distributed array in memory (during execution of the functions align_ and realn_) in such a way that the difference (ArrayPtr - BasePtr) is a multiple of TypeSize.
The elements of the local part of the distributed array are accessed either directly through its header and the base pointer, or through the functions rlocel_, wlocel_ and clocel_ described below (which is less efficient).
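Direct access through the header and the base pointer can be illustrated by a minimal sketch. It assumes that the linear index of the element (I_{1}, ... , I_{n}) is ArrayHeader[n] + I_{n} plus the sum of ArrayHeader[i]*I_{i} for i from 1 to n-1. This layout is an assumption suggested by the roles of the coefficients and the address constant, not a formula taken from this description. The helpers linear_index and adjust_const are hypothetical names and are not part of Lib-DVM.

```c
#include <assert.h>

/* Hypothetical helper (not a Lib-DVM function).
   Assumed layout: hdr[1..n-1] are the coefficients, hdr[n] is the address
   constant, and the linear index of element (idx[0], ..., idx[n-1]) is
   hdr[n] + idx[n-1] + sum over i = 1..n-1 of hdr[i] * idx[i-1]. */
static long linear_index(const long hdr[], int n, const long idx[])
{
    long off;
    int i;

    assert(n >= 1);
    off = hdr[n] + idx[n - 1];
    for (i = 1; i < n; i++)
        off += hdr[i] * idx[i - 1];
    return off;
}

/* Correction of the address constant for the case BasePtr != NULL:
   the distance between the local part and the base, counted in elements. */
static long adjust_const(long c, const void *array_ptr,
                         const void *base_ptr, long type_size)
{
    long diff = (long)((const char *)array_ptr - (const char *)base_ptr);

    assert(diff % type_size == 0);  /* the difference is a multiple of TypeSize */
    return c + diff / type_size;
}
```

With such a header, an element would be addressed as BasePtr[linear_index(ArrayHeader, n, Index)]. The division in adjust_const is exact only because the difference between the two addresses is a multiple of TypeSize.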
13.4.1 Requesting if array element is allocated in local part of distributed array
long tstelm_(long ArrayHeader[], long IndexArray[]);
ArrayHeader - header of the distributed array.
IndexArray - IndexArray[i] is the index of the distributed array element on the (i+1)-th dimension.
The number of specified indexes must be equal to the rank of the distributed array.
The function returns a non-zero value if the specified element is allocated in the local part of the specified distributed array, and zero otherwise.
13.4.2 Requesting initial and last index values of local part of distributed array
long locind_(long ArrayHeader[], long InitIndexArray[], long LastIndexArray[]);
ArrayHeader - header of the distributed array.
InitIndexArray - array whose i-th element receives the initial index value of the local part of the distributed array on the (i+1)-th dimension.
LastIndexArray - array whose i-th element receives the last index value of the local part of the distributed array on the (i+1)-th dimension.
The sizes of the arrays InitIndexArray and LastIndexArray must be equal to the rank of the distributed array.
The function returns a non-zero value if the specified array has a local part, and zero otherwise. If the array has no local part, the arrays InitIndexArray and LastIndexArray are not updated.
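The relation between the two requests above can be sketched as follows. The sketch assumes that the local part is a rectangular block, so that an element is local exactly when each of its indexes lies between the values stored by locind_ in InitIndexArray and LastIndexArray. The function in_local_part is a hypothetical stand-in for illustration; it does not call Lib-DVM.

```c
#include <assert.h>

/* Hypothetical stand-in (not tstelm_ itself): returns 1 if the element with
   indexes idx[0..rank-1] lies inside the block bounded by init[] and last[],
   and 0 otherwise. */
static long in_local_part(int rank, const long init[], const long last[],
                          const long idx[])
{
    int i;

    assert(rank >= 1);
    for (i = 0; i < rank; i++)
        if (idx[i] < init[i] || idx[i] > last[i])
            return 0;
    return 1;
}
```

For a block with initial indexes (10, 0) and last indexes (19, 7), the element (10, 7) is reported as local and the element (20, 3) is not.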
13.4.3 Reading element of local part of distributed array
long rlocel_(long ArrayHeader[], long IndexArray[], void *BufferPtr);
ArrayHeader - header of the distributed array.
IndexArray - array whose i-th element is the index of the element being read on the (i+1)-th dimension of the distributed array.
BufferPtr - pointer to the memory area the element is written to.
The function can be executed successfully only by the processor in whose memory the specified element is allocated.
The function returns the size of the distributed array element in bytes (the number of bytes read).
13.4.4 Assigning value to element of local part of distributed array
long wlocel_(void *BufferPtr, long ArrayHeader[], long IndexArray[]);
ArrayHeader - header of the distributed array.
IndexArray - array whose i-th element is the index of the element being modified on the (i+1)-th dimension of the distributed array.
BufferPtr - pointer to the memory area where the value is located.
The function can be executed successfully only by the processor in whose memory the specified element is allocated.
The function returns the size of the distributed array element in bytes.
13.4.5 Copying element of local part of distributed array to element of local part of other distributed array
long clocel_(long FromArrayHeader[], long FromIndexArray[], long ToArrayHeader[], long ToIndexArray[]);
FromArrayHeader - header of the distributed array being read.
FromIndexArray - array whose i-th element is the index of the element being read on the (i+1)-th dimension of the distributed array.
ToArrayHeader - header of the other distributed array, whose element is assigned the read value.
ToIndexArray - array whose i-th element is the index of the element being updated on the (i+1)-th dimension of the distributed array.
The function can be executed successfully only by the processor in whose memory both the read and the written elements are allocated. The types of the read and written elements must be the same.
The function returns the number of copied bytes.
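The contract shared by the functions of sections 13.4.3 to 13.4.5 (a caller-supplied buffer, indexes of a local element, the element size as the return value) can be illustrated with a toy model. The type ToyArray and the functions toy_read, toy_write and toy_copy are hypothetical: they mimic the described behaviour on a one-dimensional local block and are not the Lib-DVM implementation.

```c
#include <assert.h>
#include <string.h>

/* Toy model of a one-dimensional local part (hypothetical, for illustration). */
typedef struct {
    char *data;       /* storage of the local part */
    long  init;       /* first local index */
    long  last;       /* last local index */
    long  type_size;  /* element size in bytes */
} ToyArray;

static char *toy_addr(const ToyArray *a, long index)
{
    assert(index >= a->init && index <= a->last);  /* element must be local */
    return a->data + (index - a->init) * a->type_size;
}

/* Modelled on rlocel_: copy the element into the buffer, return its size. */
static long toy_read(const ToyArray *a, long index, void *buffer)
{
    memcpy(buffer, toy_addr(a, index), (size_t)a->type_size);
    return a->type_size;
}

/* Modelled on wlocel_: copy the buffer into the element, return its size. */
static long toy_write(const void *buffer, ToyArray *a, long index)
{
    memcpy(toy_addr(a, index), buffer, (size_t)a->type_size);
    return a->type_size;
}

/* Modelled on clocel_: copy one local element to another, return byte count. */
static long toy_copy(const ToyArray *from, long from_index,
                     ToyArray *to, long to_index)
{
    assert(from->type_size == to->type_size);      /* same element type */
    memmove(toy_addr(to, to_index), toy_addr(from, from_index),
            (size_t)from->type_size);
    return from->type_size;
}
```

In this model a request for a non-local element stops on an assertion. How Lib-DVM itself reports such a request is not covered by the sketch.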
13.4.6 Requesting address of element of local part of distributed array
char *GetLocElmAddr(long ArrayHeader[], long IndexArray[]);
ArrayHeader - header of the distributed array.
IndexArray - array whose i-th element is the index of the distributed array element on the (i+1)-th dimension.
The function can be executed successfully only by the processor in whose memory the specified element is allocated.
The function returns a pointer to the first byte of the element.
13.5 Macros to access elements of local part of distributed array of rank from 1 to 7
The following macros can be used in C programs to access the elements of the local part of a distributed array of rank from 1 to 7:
<DAElmType> DAElm<Rank> (long ArrayHeader[], <DAElmType>, long Index_{1}, ... , long Index_{<Rank>});
ArrayHeader - header of the distributed array.
Rank - the rank of the distributed array.
DAElmType - the type of the element of the distributed array.
Index_{i} - index value of the requested element on the i-th dimension of the distributed array.
Each of these macros is an L-value in the C language.
Access to the local part of a distributed array through these macros is more efficient than access through the functions described in section 13.4.
It is assumed that the array with the header ArrayHeader was created with the base pointer equal to NULL.
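That such a macro is an L-value can be seen from a minimal sketch for rank 2. The macro TOY_DAELM2 and the function toy_fill are hypothetical. The sketch assumes a header created with a NULL base pointer, in which ArrayHeader[2] holds a byte address and ArrayHeader[1] holds the coefficient of the first index, counted in elements. The actual DAElm<Rank> definitions may differ.

```c
#include <assert.h>

/* Hypothetical rank-2 access macro (not the Lib-DVM definition).
   Assumes hdr[2] is a byte address stored in a long, as in the address
   constant for BasePtr = NULL, and hdr[1] is the coefficient of the first
   index in elements. The result is a dereferenced pointer, hence an L-value. */
#define TOY_DAELM2(hdr, Type, i1, i2) \
    (*(Type *)((hdr)[2] + (long)sizeof(Type) * ((i2) + (hdr)[1] * (i1))))

/* Fill a rows x cols block of int through the macro; hdr[0] is unused here. */
static void toy_fill(long hdr[], long rows, long cols)
{
    long i, j;

    assert(sizeof(long) >= sizeof(void *));  /* a long must hold an address */
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            TOY_DAELM2(hdr, int, i, j) = (int)(10 * i + j);
}
```

Because the macro expands to a dereferenced pointer, it can appear on the left side of an assignment or in a compound assignment such as TOY_DAELM2(hdr, int, 1, 2) += 5.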
13.6 Sequential requesting index values of distributed array elements
long setind_ (long ArrayHeader[], long InitIndexArray[], long LastIndexArray[], long StepArray[]);
ArrayHeader - header of the distributed array.
InitIndexArray - array whose i-th element is the initial index value for the (i+1)-th dimension of the distributed array.
LastIndexArray - array whose i-th element is the last index value for the (i+1)-th dimension of the distributed array.
StepArray - array whose i-th element is the index step for the (i+1)-th dimension used during sequential requesting of indexes.
The function setind_ sets the initial values, last values and steps of the indexes of the distributed array elements for subsequent requesting and updating of the indexes by the function getind_ considered below.
To cover a dimension of the distributed array completely without requesting the size of the object for that dimension (see section 17.2), the initial value of the index must be set to -1. In this case the initial value of the index is taken to be 0, the step to be 1, and the last value to be the size of the array on the given dimension minus 1.
The function returns zero.
long getind_ (long ArrayHeader[], long NextIndexArray[]);
ArrayHeader - header of the distributed array.
NextIndexArray - array whose i-th element receives the next index value for the (i+1)-th dimension.
The function getind_ is intended for sequential requesting of the next index values of the distributed array elements. When the function is called for the first time, the indexes set by the function setind_ are returned. After the index values are written to the array NextIndexArray, they are updated according to the steps specified by the function setind_. The index of a dimension with a larger number changes faster than the index of a dimension with a smaller number (as in the C language).
A non-zero value is returned if the next indexes have been returned. Zero is returned if the subset of distributed array elements specified by the function setind_ is exhausted.
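The enumeration order described above can be sketched as an odometer-style iterator. The type ToyIndexIter, the functions toy_setind and toy_getind and the limit TOY_MAX_RANK are hypothetical; the sketch keeps the state in a caller-owned structure rather than in the array header, requires positive steps, and does not model the special initial value -1.

```c
#include <assert.h>

#define TOY_MAX_RANK 7  /* arbitrary limit chosen for this sketch */

/* Hypothetical iterator state; this struct is only an illustration. */
typedef struct {
    int  rank;
    long init[TOY_MAX_RANK], last[TOY_MAX_RANK], step[TOY_MAX_RANK];
    long next[TOY_MAX_RANK];
    int  exhausted;
} ToyIndexIter;

/* Modelled on setind_: store bounds and steps, position at the first tuple. */
static long toy_setind(ToyIndexIter *it, int rank, const long init[],
                       const long last[], const long step[])
{
    int i;

    assert(rank >= 1 && rank <= TOY_MAX_RANK);
    it->rank = rank;
    it->exhausted = 0;
    for (i = 0; i < rank; i++) {
        assert(step[i] > 0);
        it->init[i] = it->next[i] = init[i];
        it->last[i] = last[i];
        it->step[i] = step[i];
        if (init[i] > last[i])
            it->exhausted = 1;       /* empty range: nothing to enumerate */
    }
    return 0;
}

/* Modelled on getind_: store the current tuple, then advance it; the last
   dimension changes fastest. Returns non-zero while tuples remain. */
static long toy_getind(ToyIndexIter *it, long next_index[])
{
    int i;

    if (it->exhausted)
        return 0;
    for (i = 0; i < it->rank; i++)
        next_index[i] = it->next[i];
    for (i = it->rank - 1; i >= 0; i--) {
        it->next[i] += it->step[i];
        if (it->next[i] <= it->last[i])
            return 1;
        it->next[i] = it->init[i];   /* wrap and carry to the slower dimension */
    }
    it->exhausted = 1;               /* carried out of the slowest dimension */
    return 1;
}
```

With initial values (0, 1), last values (1, 5) and steps (1, 2), successive calls yield (0,1), (0,3), (0,5), (1,1), (1,3), (1,5), after which zero is returned.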