Fortran-DVM Compiler
Contents
1 Compiler role
2 Command line format
3 The general scheme of compiler
3.1 Parsing
3.2 Transforming parse tree
3.3 Generating code in Fortran 77
3.4 Generating code in HPF
5 Detailed description of compiler modules
5.1.1 Distributed array creation and remapping
5.1.2 Distributed array referencing
5.1.3 Parallel loop
5.2 Translating input/output statements (module io.cpp)
5.3 Restructuring parse tree (module stmt.cpp)
5.4 Translating HPF-DVM constructs (module hpf.cpp)
5.4.1 Processing distributed array references in HPF-DVM
5.4.2 INDEPENDENT loop
This report presents a detailed description of the Fortran-DVM (FDVM) compiler implementation. It covers the basic data structures, the control scheme, and the functions of the compiler modules.
The Fortran DVM (FDVM) language is an extension of Fortran 77 for parallel programming. The extension is implemented as special comments (directives) that annotate a sequential Fortran 77 program.
The input to the compiler is source code in the Fortran DVM or HPF-DVM language. The compiler produces the following output programs.
The FDVM compiler command line has the following format:
dvm fdv [ <options> ] <file-name>
The source program is read from the input file <file-name>.fdv or <file-name>.hpf.
<options> are the following compiler options:
-o file | Place output in the file file; |
-s | Produce sequential program; |
-p | Produce parallel program; |
-hpf1 | Produce HPF1 program; |
-hpf2 | Produce HPF2 program; |
-v | Display the invocations of compiler phases and version number; |
-w | Display all the warning messages; |
-Idir | Add directory dir to the list of directories searched for include files; |
-bindk | Specify the compatibility of data types between Fortran and C; k is an integer giving the compatibility table number; |
-dleveld[:fr-list] | Produce additional code for program debugging; leveld specifies the debug level, fr-list is a list of fragment numbers; |
-elevele[:fr-list] | Produce additional code for program performance analysis; levele specifies the performance debugging level. |
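The option forms listed above can be decoded with a simple prefix-matching loop. The following C++ fragment is only a sketch of that idea, not the actual driver code: the `Options` structure, its field names, and the `parseArgs` function are hypothetical, and levels and table numbers are kept as plain text because their exact value ranges are not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for decoded options (not the FDVM driver's own type).
struct Options {
    std::string outputFile;                  // -o file
    std::string mode;                        // "s", "p", "hpf1" or "hpf2"
    bool verbose = false;                    // -v
    bool allWarnings = false;                // -w
    std::vector<std::string> includeDirs;    // -Idir, may be repeated
    std::string bindTable;                   // k from -bindk, kept as text
    std::string debugLevel, debugFragments;  // -dleveld[:fr-list]
    std::string perfLevel, perfFragments;    // -elevele[:fr-list]
    std::string inputFile;                   // <file-name>
};

// Split "level[:fr-list]" at the first ':'; fragments is empty if no ':' is present.
static void splitLevel(const std::string& text, std::string& level, std::string& fragments) {
    const std::size_t colon = text.find(':');
    level = text.substr(0, colon);
    fragments = (colon == std::string::npos) ? std::string() : text.substr(colon + 1);
}

// Decode the arguments that follow "dvm fdv".
// Returns false for an unknown option, a missing -o argument, or a missing file name.
bool parseArgs(const std::vector<std::string>& args, Options& out) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& a = args[i];
        if (a == "-o") {
            if (++i == args.size()) return false;  // -o must be followed by a file name
            out.outputFile = args[i];
        } else if (a == "-s" || a == "-p" || a == "-hpf1" || a == "-hpf2") {
            out.mode = a.substr(1);
        } else if (a == "-v") {
            out.verbose = true;
        } else if (a == "-w") {
            out.allWarnings = true;
        } else if (a.rfind("-I", 0) == 0 && a.size() > 2) {
            out.includeDirs.push_back(a.substr(2));
        } else if (a.rfind("-bind", 0) == 0) {
            out.bindTable = a.substr(5);
        } else if (a.rfind("-d", 0) == 0) {
            splitLevel(a.substr(2), out.debugLevel, out.debugFragments);
        } else if (a.rfind("-e", 0) == 0) {
            splitLevel(a.substr(2), out.perfLevel, out.perfFragments);
        } else if (!a.empty() && a[0] == '-') {
            return false;                          // unknown option
        } else {
            out.inputFile = a;
        }
    }
    return !out.inputFile.empty();
}

// Usage example; the level "2" and the fragment list "1,3" are made-up sample values.
void exampleParse() {
    Options opts;
    const bool ok = parseArgs({"-p", "-o", "out.f", "-Iinclude", "-d2:1,3", "prog"}, opts);
    assert(ok && opts.mode == "p" && opts.outputFile == "out.f");
    assert(opts.debugLevel == "2" && opts.debugFragments == "1,3");
    assert(opts.inputFile == "prog");
}
```

Exact-match options are tested before the prefix forms so that, for example, -p is never mistaken for a longer option.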
3 The general scheme of compiler
The Sage++ system is used as the tool for building the FDVM compiler.
Sage++ is an object-oriented toolkit for building program transformation systems for the Fortran 77, Fortran 90, C and C++ languages. It is designed as an open C++ class library that provides the user with a set of parsers, a structured parse tree, and symbol and type tables. The heart of the system is a set of functions for restructuring the parse tree and a mechanism (called unparsing) for generating new source code from the restructured internal form.
The FDVM compiler consists of four components:
The Fortran parser of Sage++, which is based on GNU Bison (a version of YACC), is extended to accept the FDVM language extensions (DVM directives). It consists of the following modules:
ftn.gram | - grammar rules for Fortran |
fdvm.gram | - grammar rules for Fortran DVM |
lexfdvm.c | - lexical analyzer |
tag | - variant tag list |
tokens | - lexeme list |
gram1.tab.c | - parser generated by Bison |
cftn.c | - main routine (calls parser, opens and closes the files that are needed) |
init.c | - initialization routines |
stat.c | - routines for creating internal form of statements (bif node of parse tree) |
errors.c | - printing error messages |
sym.c | - symbol table routines |
types.c | - routines to handle the variable declarations |
lists.c | - routines to build the lists |
misc.c | - miscellaneous help routines |
hash.c | - hash table routines |
The parser reads the source file, checks the concrete syntax, constructs a parse tree, and writes its internal representation to a .dep file.
The second phase of compilation analyzes and restructures the internal representation of the FDVM program. Each DVM directive is replaced by a sequence of Lib-DVM function calls. New source code is then generated from the restructured internal form.
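The replacement of a directive by a sequence of library calls can be pictured as a rewrite over the statement list. The C++ fragment below is a minimal sketch under stated assumptions: it does not use the Sage++ classes, the `Stmt` type and both functions are hypothetical, and the generated call names (`placeholder_begin`, `placeholder_end`) are placeholders rather than real Lib-DVM functions.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical statement kinds for this sketch.
enum class Kind { Ordinary, DvmDirective, LibCall };

// Hypothetical, much simplified statement node.
struct Stmt {
    Kind kind;
    std::string text;
};

// Placeholder expansion rule: every directive becomes two calls.
// The call names are invented for illustration and are not Lib-DVM functions.
std::vector<Stmt> expandDirective(const Stmt& directive) {
    return {
        {Kind::LibCall, "call placeholder_begin()  ! from " + directive.text},
        {Kind::LibCall, "call placeholder_end()"},
    };
}

// Replace each directive in the statement list by its expansion, in place.
void lowerDirectives(std::list<Stmt>& program) {
    for (auto it = program.begin(); it != program.end();) {
        if (it->kind != Kind::DvmDirective) {
            ++it;
            continue;
        }
        const std::vector<Stmt> calls = expandDirective(*it);
        program.insert(it, calls.begin(), calls.end());  // calls are placed before the directive
        it = program.erase(it);                          // directive removed; continue after it
    }
}

// Usage example: one directive between two ordinary statements.
void exampleLowering() {
    std::list<Stmt> program = {
        {Kind::Ordinary, "x = 1"},
        {Kind::DvmDirective, "SOME_DIRECTIVE"},
        {Kind::Ordinary, "y = 2"},
    };
    lowerDirectives(program);
    assert(program.size() == 4);
    assert(program.front().text == "x = 1");
    assert(program.back().text == "y = 2");
}
```

A linked list is used here because insertion and removal do not invalidate iterators to the other statements, which mirrors the "pointer to the next statement node" organization of the parse tree.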
The back-end of the compiler is written in C++ using the Sage++ class library.
The Sage++ library is organized as a class hierarchy that provides access to the parse tree, symbol table and type table for each file in an application project. There are five basic families of classes in the library: Project and Files, Statements, Expressions, Symbols, and Types.
Project and Files correspond to source files. Statements correspond to the basic source statements in Fortran 77 and DVM directives. Expressions are contained within statements. Symbols are the basic user defined identifiers. Types are associated with each identifier and expression.
The file libSage++.h contains all the class definitions.
The translator comprises seven modules:
dvm.cpp | - analyzing and translating FDVM constructs |
funcall.cpp | - generating Lib-DVM library calls |
stmt.cpp | - restructuring parse tree |
io.cpp | - translating I/O statements |
debug.cpp | - support of debugging mode |
help.cpp | - miscellaneous help routines |
hpf.cpp | - translating HPF-DVM constructs |
3.3 Generating code in Fortran 77
New source code is generated from the restructured internal form by the unparse( ) member function of the File class of the Sage++ class library.
When the source FDVM program is converted to an HPF program, the following routines and tables are used for unparsing:
unparse_hpf.c | - routines for generating HPF code |
low_hpf.c | - low-level routines for unparsing |
unparse.hpf | - table driving the generation of HPF2 code |
unparse1.hpf | - table driving the generation of HPF1 code |
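The idea of a table that drives code generation can be sketched as follows: the unparsing routine is fixed, and the output dialect is selected by swapping the table. This C++ fragment is only an illustration. The `Node` type, the variant names, and the `%L`, `%R`, `%N` format codes are assumptions made for the example and do not reproduce the actual format of the unparse.hpf and unparse1.hpf tables.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical expression node: a variant tag, an optional name, two subtrees.
struct Node {
    std::string variant;               // e.g. "VAR" or "ADD" (invented tags)
    std::string name;                  // identifier, used by leaf nodes
    std::shared_ptr<Node> left, right;
};

// Maps a variant tag to an output template.
// Assumed codes: %L = left subtree, %R = right subtree, %N = node name.
using Table = std::map<std::string, std::string>;

// Generic unparser: all dialect-specific text comes from the table.
// A variant that is missing from the table produces an empty string.
std::string unparse(const Node& n, const Table& table) {
    const auto entry = table.find(n.variant);
    if (entry == table.end()) return std::string();
    const std::string& fmt = entry->second;
    std::string out;
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] == '%' && i + 1 < fmt.size()) {
            const char code = fmt[++i];
            if (code == 'L' && n.left) out += unparse(*n.left, table);
            else if (code == 'R' && n.right) out += unparse(*n.right, table);
            else if (code == 'N') out += n.name;
        } else {
            out += fmt[i];
        }
    }
    return out;
}

std::shared_ptr<Node> leaf(const std::string& name) {
    auto n = std::make_shared<Node>();
    n->variant = "VAR";
    n->name = name;
    return n;
}

std::shared_ptr<Node> binary(const std::string& variant,
                             std::shared_ptr<Node> l, std::shared_ptr<Node> r) {
    auto n = std::make_shared<Node>();
    n->variant = variant;
    n->left = l;
    n->right = r;
    return n;
}

// Usage example: the same tree rendered with two different tables.
void exampleUnparse() {
    const Table infix = {{"VAR", "%N"}, {"ADD", "%L + %R"}};
    const Table prefix = {{"VAR", "%N"}, {"ADD", "add(%L, %R)"}};
    const auto sum = binary("ADD", leaf("b"), leaf("c"));
    assert(unparse(*sum, infix) == "b + c");
    assert(unparse(*sum, prefix) == "add(b, c)");
}
```

Keeping the dialect in data rather than in code means that a second output language needs only a second table, which is the apparent motivation for having separate HPF1 and HPF2 tables.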
The definitions of data structures of internal representation are contained in the files:
The structures of the parse tree nodes for a statement and an expression are given in Fig. 4.1 and Fig. 4.2, respectively. Fig. 4.4 illustrates a fragment of the parse tree.
Fig. 4.3 presents the Symbol and Type table entries.
variant tag |
identification tag |
index |
global line number |
local line number |
declaration specifier |
pointer to the label |
pointer to the next statement node |
pointer to the source filename |
pointer to the control parent node |
property list |
list of nodes (list of procedures) |
pointer to the comment |
symbol table entry |
L-value expr tree |
R-value expr tree |
spare expr tree |
do-label (used by do) |
null |
null |
null |
null |
Fig. 4.1. Parse tree node representing a statement (bif node).
variant tag |
identification tag |
pointer to the next node (by allocation order) |
pointer to the Type table element |
constant value |
pointer to the Symbol table element |
L-value expr tree |
R-value expr tree |
Fig. 4.2. Parse tree node representing an expression (low level node).
Type table entry | Symbol table entry |
variant tag | variant tag |
identification tag | identification tag |
length | identifier |
spare field | Hash table entry |
spare field | special list |
use-definition chain | special list |
base type entry (for array) | special list |
ranges (for array) | next Symbol table entry |
 | Type table entry |
 | Scope |
 | use-definition chain |
 | attributes (mask) |
 | do-variable flag |
 | parser used |
 | pointer to value (for constants) |
 | special fields |
Fig. 4.3. Type and Symbol table entries.
Fig. 4.4. Internal representation of the statement a = b + c.
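To make the node layouts of Fig. 4.1 and Fig. 4.2 more concrete, the following C++ sketch builds a reduced internal form of the statement a = b + c: one statement node whose L-value and R-value fields point to expression trees, with leaf expression nodes pointing to symbol table entries. All type, field, and variant names here are hypothetical simplifications and are not the Sage++ definitions; only a few of the fields listed in the figures are modeled.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant tags for this sketch.
enum class Variant { AssignStmt, VarRef, AddOp };

// Reduced symbol table entry: only the identifier is modeled.
struct Symbol {
    std::string identifier;
};

// Reduced expression (low level) node, after Fig. 4.2.
struct ExprNode {
    Variant variant;
    const Symbol* symbol;                 // pointer to the Symbol table element
    std::unique_ptr<ExprNode> lhs, rhs;   // L-value and R-value expr trees
};

// Reduced statement (bif) node, after Fig. 4.1.
struct StmtNode {
    Variant variant;
    int lineNumber;
    std::unique_ptr<ExprNode> lvalue, rvalue;
    StmtNode* next;                       // pointer to the next statement node
};

std::unique_ptr<ExprNode> varRef(const Symbol& s) {
    std::unique_ptr<ExprNode> n(new ExprNode());
    n->variant = Variant::VarRef;
    n->symbol = &s;
    return n;
}

std::unique_ptr<ExprNode> addOp(std::unique_ptr<ExprNode> l, std::unique_ptr<ExprNode> r) {
    std::unique_ptr<ExprNode> n(new ExprNode());
    n->variant = Variant::AddOp;
    n->symbol = nullptr;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Depth-first walk that records every identifier referenced by an expression.
void collect(const ExprNode* e, std::vector<std::string>& out) {
    if (e == nullptr) return;
    if (e->symbol != nullptr) out.push_back(e->symbol->identifier);
    collect(e->lhs.get(), out);
    collect(e->rhs.get(), out);
}

// Identifiers used by a statement: L-value tree first, then R-value tree.
std::vector<std::string> referencedSymbols(const StmtNode& s) {
    std::vector<std::string> out;
    collect(s.lvalue.get(), out);
    collect(s.rvalue.get(), out);
    return out;
}

// Usage example: build a = b + c and list the symbols it references.
void exampleTree() {
    const Symbol a{"a"}, b{"b"}, c{"c"};
    StmtNode assign{};
    assign.variant = Variant::AssignStmt;
    assign.lineNumber = 1;
    assign.lvalue = varRef(a);
    assign.rvalue = addOp(varRef(b), varRef(c));
    const std::vector<std::string> names = referencedSymbols(assign);
    assert(names.size() == 3);
    assert(names[0] == "a" && names[1] == "b" && names[2] == "c");
}
```

The expression nodes refer to symbols by pointer instead of storing names, so every reference to the same variable shares one symbol table entry.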