C-DVM compiler. Detailed design

C-DVM compiler
Detailed design
* 22 June 2000 *

Contents

1.1 Running Compiler

2.1 Data Structures of the Compiler
2.2 The Linked Memory
2.3 The Scanner and the String Memory
2.4 Tokens and their symbolic names

2.4.1 Terminals
2.4.2 Delimiters and operators
2.4.3 Keywords
2.4.4 C-DVM keywords and identifiers

2.5 The Parser
2.6 The parsing tree structure

2.6.1 Terminals, lists and braces
2.6.2 Declarations (of C)
2.6.3 Statements (of C)
2.6.4 Expressions (of C)
2.6.5 Directives (of C-DVM)
2.6.6 Clauses and subdirectives (of C-DVM)

2.7 The Tree Transformation
2.8 The Unparser

3 Compilation of C-DVM constructs

3.1 Data distribution

3.1.1 DISTRIBUTE directive
3.1.2 GENBLOCK distribution format
3.1.3 ONTO clause
3.1.4 REDISTRIBUTE directive
3.1.5 ALIGN directive
3.1.6 REALIGN directive
3.1.7 TEMPLATE clause
3.1.8 CREATE_TEMPLATE directive

3.2 Distribution of computations (loops and tasks)

3.2.1 PARALLEL directive
3.2.2 ACROSS clause
3.2.3 PROCESSORS directive and NUMBER_OF_PROCESSORS() function
3.2.4 TASK directive
3.2.5 MAP directive
3.2.6 TASK_REGION directive
3.2.7 ON-block construct
3.2.8 ON-loop construct

3.3 Shadow edges

3.3.1 SHADOW clause
3.3.2 SHADOW_RENEW clause
3.3.3 SHADOW_GROUP directive
3.3.4 CREATE_SHADOW_GROUP directive
3.3.5 SHADOW_START directive
3.3.6 SHADOW_START clause
3.3.7 SHADOW_WAIT directive
3.3.8 SHADOW_WAIT clause

3.4 Remote access

3.4.1 REMOTE_ACCESS directive and clause
3.4.2 REMOTE_GROUP directive
3.4.3 PREFETCH directive
3.4.4 RESET directive
3.4.5 Remote references

3.5 Reduction operations

3.5.1 REDUCTION_GROUP directive
3.5.2 REDUCTION clause
3.5.3 Reduction variables and operations
3.5.4 REDUCTION_START directive
3.5.5 REDUCTION_WAIT directive

3.6 Implicit constructions

3.6.1 Creation and deletion of distributed arrays
3.6.2 Static distributed arrays
3.6.3 References to distributed data
3.6.4 Own computation
3.6.5 Initialization and completion of parallel execution
3.6.6 Input-output functions

3.7 Debugging extensions

3.7.1 Performance analyzer. Loops
3.7.2 Performance analyzer. INTERVAL directive
3.7.3 Debugger. Data tracing
3.7.4 Debugger. Computation tracing
3.7.5 Sequential code

1 Purpose of the Compiler

C-DVM is the C language extended by special annotations for specifying parallel execution of a program. These annotations are called DVM-directives. C-DVM compiler translates an annotated C-DVM program to a SPMD stile program that contains calls to Run-Time Library (RTL).

Besides "pure" parallel code the compiler can produce an "extended" debugging code to use features of the performance analyzer (PPPA) and debugger, and also a "sequential" code (i.e. without RTL calls) with such debugging extensions.

1.1 Running Compiler

For portability the compiler is written in ANSI-C and runs in command line mode (under OS UNIX and WINDOWS) using the following command:

c_dvm [ options ] [ -o outfile ] infile

where:

infile - an obligatory parameter; a source C-DVM file; it is suppoused that a program is valid C program, compilied by any usual C compiler.

-o outfile -- an optional parameter: the name of output file with converted program; it is cdvm_out.c by default;

options - one or more from the following options:

-s - produce sequential program (no DVM, trace or statistics only); a parallel program is generated by default;

tracing modes:

-d1   - tracing distirbuted data updates,
-d2   - tracing all accesses to distributed data,
-d3   - tracing all data updates,
-d4   - tracing all accesses to all data.

trace accumulation modes:

-e1   - parallel loops and surrounding sequential loops,
-e2   - parallel loops and user defined INTERVALs;
-e3   - e1 + e2,
-e4   - e3 + all sequential loops.

others:

-v   - Verbose mode, output messages to the screen,
-w   - enable all Warnings,
-w-   - disable all Warnings,
-xNNN   - eXtent of internal tables (-x16 by default).

2 The Structure of the Compiler

The main program of the Compiler performs the following steps:

gets and verifies command line parameters (the function cmd_line());
allocates variable-size data (buffers) and initializes lexical and syntactical tables;
opens input and output files;
invokes a scanner to read the source program and to build a token stream;
invokes a parser to read the token stream and to build a parsing tree; the parser in its turn invokes semantic functions to analize the source program and to build required output constructions;
if the parser successfully built the tree (syntax errors were not detected) the main program invokes (tree) transformer to substitute source nodes by output ones;
then an unparser generates an output file.

2.1 Data Structures of the Compiler

Compilation is made upon an internal representation of a program.The main data structures used for the internal representation are the following:

string memory for character representation of tokens (LX);
linked memory in which a parsing tree is built (NDs);
hash tables for fast search of tokens and nodes (HT0, HT1).

Sizes of these structures determines the maximal size of a source program. If the size is not enough it can be "scaled" by command line parameter -xNNN.

The hash tables are accessed by the function HTfind(item,len,ht,ht_cmp,ht_new). It searches an item of length len in a hash table ht using ht_cmp function to compare items and ht_new function to write new item to the table. It returns the cell number in the hash table where the item is kept. The cell contents is interpreted by a calling function and ht_cmp and ht_new functions.

2.2 The Linked Memory

The linked memory is used to store the following data:

syntactic tables of the parser;
the token stream;
the parsing tree;
a stack for the parser and semantic functions;
scopes and declarations lists.

Each node of the linked memory has the following four fields:

C - code of the node, usually it is an internal number of C statement or expression, or DVM-directive;
A, B - references to subtrees of operands or components; they may be equal to zero for optional subconstructs or unary operations; sometimes these fields are not references to subtrees, e.g. for a node, representing a token, it contains a reference to the string memory;
D - list of the node attributes; the main "attribute" of node is NEW attribute - an output form of the corresponding source construction.

The linked memory is implemented by the following functions:

NDinit() cleans the linked memory buffer and creates the free node list;
NDD(c,a,b,d) creates and filles a new node;
mk(c,a,b) creates or finds a node (c,a,b,?);
NDdel(N) deletes the node N;
wN(N,c,a,b) fills the fields C, A and B fields of the node N.
A, B, C and D(N) read the corresponding field of the node N;
wA, wB, wC and wD(N,x) write a value x to the corresponding field of the node N;
NDget(N,path) gets subtree by the "path"; for example NDget(N, 122) means C(A(A(N))).

The attribute list is supported by the functions:

set(c,N,V) sets the value V to the c attribute of the node N;
get(c,N) gets the value of c attribute of the node N;
del(c,N) deletes c attribute of the node N.

Semantic stack is implemented as a B-linked list in the linked memory with the root Stack and functions SSpush and SSpop. There are functions walk(N,sfun) and walkR(N) which implement standard walk throught the parsing tree invoking a given semantic function sfun for every visited node.

To handle declarations a stack of visibility scopes (as a list in the linked memory) and lists of currently defined variables for every scope are built. These structures are supported by the following functions:

openScope() entering new scope, pushes scope stack;
closeScope() leaving a scope, popes scope stack;
addDecl(N,spec,dvm) stores a declaration in the current scope;
Decl(lx) returns 0 if lx is not declared;
whatis(N) finds and unpacks declaration and DVM-directive (if it exists) into global variables for use by semantic functions;
and some others.

2.3 The Scanner and the String Memory

The main scanner function is scan() function. On every invocation it:

skips blank characters; in particular, comments;
reads next line if it is necessary;
determines the type of next token by the first character;
moves over characters belonging to the next token;
finds it in (or write to) the token table;
sets global variables LXcode and LXval.

LXcode is the code of the current token, namely: for C and C-DVM keywords and C delimiters it is their internal number; and for identifiers or constants it is their lexical category number. In the latter case LXvalue is an internal number of the identifier or constant, i.e. a reference to a node with the code TOKEN. String representation of identifiers and constants are kept in the string memory and used in generation step. For parser they are represented by their internal numbers. Internal number is assigned to an identifier at its first occurence. The scanner looks for the current token in the token table (hash function is used), and adds new token to it if necessary.

Other scanner functions:

LXinit() initializes scanner tables;
LX_get(t) selects next token skipping characters with type t (defined in table LXcd: e.g. decimal digit, hexadecimal digit, letter);
comment() skips a comment;
LXfind(lxcode) finds the current token in the token table; writes new one;
LXW(token) gets character representation of a token by its code;
Token(s) makes a TOKEN node for string s;
LxiS(s) makes an LXI (identifier) node for string s;
Lxi(token) makes an LXI (identifier) node for a token;
Lxn(n) makes an LXN (number) node for a number n.

2.4 Tokens and their symbolic names

2.4.1 Terminals

LXEOF    "LXEOF"  -- end of file
TOKEN    "TOKEN"  -- token in the input stream
DLM      "DLM"    -- delimiter in the input stream
KWD      "KWD"    -- keyword in the input stream
LXX      "LXX"    -- undelimited list item and tail
LXI      "IDENT"  -- identifier
LXN      "NUMBER" -- numeric constant
LXC      "LXC"    -- character constant
LXR      "REAL"   -- floating point constant
LXS      "STRING" -- string constant
CPP      "CPP"    -- preprocessor directive as string

2.4.2 Delimiters and operators

SEMIC                ";"
COMMA                ","
LBA                  "("
LBS                  "{"
LBK                  "["
RBA                  ")"
RBK                  "]"
RBS                  "}"
POINT                "."
QUEST                "?"
COLON                ":"
DIV                  "/"
ADIV                 "/="
COMM                 "/*"    -- C comment beginning
CPPCOMM              "//"    -- C++ comment beginning
MOD                  "%"
AMOD                 "%="
ADD                  "+"
AADD                 "+="
INC                  "++"
SUB                  "-"
ASUB                 "-="
DEC                  "--"
ARROW                "->"
MUL                  "*"
AMUL                 "*="
ASSIGN               "="
EQU                  "=="
LT                   "<"
LE                   "<="
LSH                  "<<"
ALSH                 "<<="
GT                   ">"
GE                   ">="
RSH                  ">>"
ARSH                 ">>="
BNOT                 "~"
NOT                  "!"
NEQ                  "!="
BAND                 "&"
AAND                 "&="
AND                  "&&"
BOR                  "|"
AOR                  "|="
OR                   "||"
BXOR                 "^"
AXOR                 "^="

2.4.3 Keywords

RETURN               "return"
IF                   "if"
ELSE                 "else"
WHILE                "while"
DO                   "do"
FOR                  "for"
BREAK                "break"
CONTINUE             "continue"
SWITCH               "switch"
CASE                 "case"
DEFAULT              "default"
GOTO                 "goto"
SIZEOF               "sizeof"
TYPEDEF              "typedef"
EXTERN               "extern"
STATIC               "static"
REGISTER             "register"
AUTO                 "auto"
CONST                "const"
INT                  "int"
SHORT                "short"
LONG                 "long"
VOID                 "void"
CHAR                 "char"
SIGNED               "signed"
UNSIGNED             "unsigned"
FLOAT                "float"
DOUBLE               "double"
STRUCT               "struct"
UNION                "union"
ENUM                 "enum"

2.4.4 C-DVM keywords and identifiers

LX_DVM               "DVM"
PROCESSORS           "PROCESSORS"
DISTRIBUTE           "DISTRIBUTE"
BLOCK                "BLOCK"
GENBLOCK             "GENBLOCK"
ONTO                 "ONTO"
ALIGN                "ALIGN"
WITH                 "WITH"
TEMPLATE             "TEMPLATE"
SHADOW               "SHADOW"
LX_SG                "SHADOW_GROUP"
LX_RG                "REDUCTION_GROUP"
LX_RMG               "REMOTE_GROUP"
TASK                 "TASK"
REDISTRIBUTE         "REDISTRIBUTE"
REALIGN              "REALIGN"
NEW                  "NEW"
LX_CRTEMP            "CREATE_TEMPLATE"
PARALLEL             "PARALLEL"
ON                   "ON"
REDUCTION            "REDUCTION"
SHRENEW              "SHADOW_RENEW"
ACROSS               "ACROSS"
LX_CRSG              "CREATE_SHADOW_GROUP"
CORNER               "CORNER"
SHSTART              "SHADOW_START"
SHWAIT               "SHADOW_WAIT"
SUM                  "SUM"
PROD                 "PRODUCT"
MAX                  "MAX"
MIN                  "MIN"
LX_OR                "OR"
LX_AND               "AND"
MAXLOC               "MAXLOC"
MINLOC               "MINLOC"
RSTART               "REDUCTION_START"
RWAIT                "REDUCTION_WAIT"
REMOTE               "REMOTE_ACCESS"
INDIRECT             "INDIRECT_ACCESS"
PREFETCH             "PREFETCH"
RESET                "RESET"
MAP                  "MAP"
LX_TASKREG           "TASK_REGION"
INTERVAL             "INTERVAL"
DEBUG                "DEBUG"
LX_NUMBER_OF_PROC    "NUMBER_OF_PROCESSORS"
LX_FOR               "FOR"
LX_DO                "DO"
LX_main              "main"
LX_malloc            "malloc"
LX_free              "free"
optD0                "d0"
optD1                "d1"
optD2                "d2"
optD3                "d3"
optD4                "d4"
optE0                "e0"
optE1                "e1"
optE2                "e2"
optE3                "e3"
optE4                "e4"

2.5 The Parser

Purpose of the parser is evident:

to analyze context-free structure of the program;
to report errors, if they were encountered;
to build the parsing tree;
to invoke a semantic functions for built nodes.

The method of parsing is straitforward top-down descent with backtracking in not numerous doubtful cases. The structure of resulting tree is described in the following section.

The parser is implemented as a recursive function Parse, which uses an internal representation of the source syntax stored in the linked memory as a syntax tree. The syntax tree is built by the function STXinit using functions:

STXterm for a terminal element creation;
STXelem for a non-terminal creation;
STXrule for a syntax rule;
STXgroup for a group of syntax rules;
STXgrlist for a list of the groups.

The parser also uses the following functions:

ParsErr reports an error;
cmpLX compares the current token with terminal element of syntax rule;
SkipNext tests "lookforward" rules and skips the next rule if necessary;
ISWF is the main semantic function; it is invoked twice for every node.

2.6 The parsing tree structure

The following table describes the structure of the parsing tree in such a manner. For all node codes (on the left) a fragment of source code where A and B denote subconstructs corresponding to subtrees, hanging on the fields A and B respectively, is presented. AB referes to the subtree hanging on the field A of the node hanging on the field B of the initial node. Where partitioning of a source code to subconstructs is not clear it is commented. Here only "free" structure is described, i.e. dependences and constrains following from syntax are not mentioned.

2.6.1 Terminals, lists and braces

TOKEN:       -- A is the position in the string memory
DLM:         -- A is the internal code of delimiter
KWD:         -- A is the internal code of keyword
LXI:         -- A is reference to a token node
LXN:         -- A is reference to a token node
LXC:         -- A is reference to a token node
LXR:         -- A is reference to a token node
LXS:         -- A is reference to a token node
CPP:         -- A is preprocessor directive as a string
COMMA:   A , B           -- comma-list
LXX:     A [B-tail]      -- B-list
XXlist:  A [B-tail]      -- B-list
XXslist: A [B-tail]      -- B-list
LBA:     "(" [A] ")"
LBS:     "{" [A ] "}"    -- compound statement
LBK:     [A] "[" [B] "]" -- indexing

2.6.2 Declarations (of C)

LXX:     [AA-DVM_directive] BA-decl [ B-next ]
XXdecl:  [AA-storage_class] BA-type B-declarator-list
XXtype:  [AA-storage_class] BA-type B-abstract-declarator

Memory class (one or more B-linked nodes):

STATIC:     static     [B]
AUTO:       auto       [B]
REGISTER:   register   [B]
EXTERN:     extern     [B]
TYPEDEF:    typedef    [B]
CONST:      const      [B]

Types (simple types are B-linked):

STRUCT:     struct     [A] ["{" AB-fields "}"]
UNION:      union      [A] ["{" AB-fields "}"]
ENUM:       enum       [A] ["{" B "}"]
VOID:       void
TYPEDEF:    B - typedef-name
UNSIGNED:   unsigned   [B]
SIGNED:     signed     [B]
CHAR:       char       [B]
SHORT:      short      [B]
LONG:       long       [B]
INT:        int        [B]
FLOAT:      float      [B]
DOUBLE:     double     [B]

Declarators:

0:           AA-declarator [BA-initializer] B-list-tail
LXaster:     * [B-"const"] A
LXfun:       A "(" [B] ")"

Initializers:

ASS:        =  A
COLON:      :  A

Declarator list tail:

XXbody:     "{" [A] "}"  [B -- ";"]
XXclist:    , A
SEMIC:      ;       -- end of list

2.6.3 Statements (of C)

XXoper:     A
IF:         if "(" AA ")" BA [else B]
WHILE:      while "(" A ")"  B
DO:         do B  while "(" A ")" ;
FOR:        for "(" AA ; ABA ; BBA ")"  B
SWITCH:     switch  "(" A ")" "{" B "}"
GOTO:       goto  A ;
LXskip:     ;
RETURN:     return  [A] ;
LX_FOR:     FOR  "("  AA , BA ")"  B
LX_DO:      DO   "("  AA-variable , BA ")"  B
CONTINUE:   continue ;
BREAK:      break ;
COLON:      A :
XXexpr:     A ;
DEFAULT:    default:  [B]
CASE:       case A : [B]

2.6.4 Expressions (of C)

LBK:        A "[" B "]"
POINT:      A .   B
ARROW:      A ->  B A
SS:         A =  B
AADD:       A += B
ASUB:       A -= B
AMUL:       A *= B
ADIV:       A /= B
AMOD:       A %= B
ARSH:       A >>= B
ALSH:       A <<= B
AAND:       A &= B
AXOR:       A ^= B
AOR:        A |= B
QUEST:      A ? AB : BB
OR:         A || B
AND:        A && B
BOR:        A | B
BXOR:       A ^ B
BAND:       A & B
EQU:        A == B
NEQ:        A != B
GT:         A >  B
GE:         A >= B
LT:         A <  B
LE:         A <= B
RSH:        A >> B
LSH:        A << B
ADD:        [A] + B
SUB:        [A] - B
MUL:        A * B
DIV:        A / B
MOD:        A % B
LXinca:     A ++
LXdeca:     A --
NOT:        ! B
BNOT:       ~ B
SUB:        - B
ADD:        + B
INC:        ++ B
DEC:        -- B
SIZEOF:     sizeof ( B )
LXaddr:     &   B
LXcont:     *   B
LXfun:      A  "("  [B] ")"
LXcast:     "(" A ")" B

2.6.5 Directives (of C-DVM)

LX_DVM:      DVM "(" [B-"*"] A ")"

DISTRIBUTE:  DISTRIBUTE [ AA [ ONTO BA ] ] [ B-subdirs ]
ALIGN:       ALIGN [ AA WITH BA ] [ B-subdirs ]
LX_SG:       SHADOW_GROUP
LX_RG:       REDUCTION_GROUP
LX_RMG:      REMOTE_GROUP
TASK:        TASK
PROCESSORS:  PROCESSORS
REDISTRIBUTE: REDISTRIBUTE AA [ ONTO BA ] [ B-"NEW" ]
REALIGN:     REALIGN AA WITH BA [ B-"NEW" ]
LX_CRTEMP:   CREATE_TEMPLATE  A
MAP:         MAP A ONTO B
LX_CRSG:     CREATE_SHADOW_GROUP A ":" B-Renewees
SHSTART:     SHADOW_START A
SHWAIT:      SHADOW_WAIT A
RSTART:      REDUCTION_START A
RWAIT:       REDUCTION_WAIT A
PREFETCH:    PREFETCH A
RESET:       RESET A
PARALLEL:    PARALLEL AA ON BA [ B-clauses ]
REMOTE:      REMOTE_ACCESS [ A ":" ] B
LX_TASKREG   TASK_REGION A [ B-clause ]
ON:          ON A
INTERVAL:    INTERVAL [ A ]
DEBUG:       DEBUG A-number B-Debug_modes
    Debug_modes are B-linked nodes:

optD0:       -d0 [ B ]
optD1:       -d1 [ B ]
optD2:       -d2 [ B ]
optD3:       -d3 [ B ]
optD4:       -d4 [ B ]
optE0:       -e0 [ B ]
optE1:       -e1 [ B ]
optE2:       -e2 [ B ]
optE3:       -e3 [ B ]
optE4:       -e4 [ B ]
    Renewees subtree is an LXX-list of:

DVMshad:     A-DVMshw [ B-"CORNER" ]
DVMshw:      [ A ] "[" AB [ ":" BB ] "]"

2.6.6 Clauses and subdirectives (of C-DVM)

TEMPLATE:    ";" TEMPLATE A
SHADOW:      ";" SHADOW A-DVMshw
SHRENEW:     ";" SHADOW_RENEW B-Renewees
SHSTART:     ";" SHADOW_START A
SHWAIT:      ";" SHADOW_WAIT A
ACROSS:      ";" ACROSS B-Renewees
REDUCTION:   ";" REDUCTION [ A ":" ] B-Red_ops
REMOTE:      ":" REMOTE_ACCESS [ A ":" ] B
    Red_ops subtree is an LXX-list of:

SUM:         SUM   "(" A  ")"
PROD:        PROD  "(" A  ")"
MAX:         MAX   "(" A ")"
MIN:         MIN   "(" A  ")"
LX_AND:      AND   "(" A  ")"
LX_OR:       OR    "(" A  ")"
MAXLOC:      MAXLOC  "(" A, B ")"
MINLOC:      MINLOC  "(" A, B ")"

2.7 The Tree Transformation

The tree transformation is performed in two steps:

while parsing new nodes are built and stored as attributes of nodes of the parsing tree;
after parsing the function walk is invoked (to walk through the parsing tree) with semantic function pSUBST substituting (in botton-up manner) new nodes instead of old ones.

2.8 The Unparser

The function UnPars traverses the tree and restores external text representation of the program. As this file is only temporary, formatting is relatively simple. Note that output program is written in terms of C-DVM macros, but not directly in terms of RTL calls. The final form of the macros may be found in the file cdvm_c.h.

3 Compilation of C-DVM constructs ==>