VIRGINIA TECH OF DEPARTMENT OF COMPUTER SCIENCE

COMPILERS AND TRANSLATORS

LEXICAL ANALYSIS

SYNTACTIC ANALYSIS

SEMANTIC ANALYSIS

CODE GENERATION

OPTIMIZATION


SYMBOL TABLE

SYNTACTIC ANALYSIS

  • Objectives
    • To study the various styles of syntactic analyzers in use in compilers,
    • To recognize the different capabilities of syntactic analyzers, and
    • To apply those capabilities to the efficient development of a composite syntactic analyzer.
  • Notes
  • While we generally present a compiler of consisting of five parts, one of which is the syntactic analyzer, through which the source code is transformed to the object code step-by-step, in practice this model is not accurate. Firstly it is rare that the whole code is transformed completely at each step, but instead each processor will pass on to its successor a portion of the recognized code to be further processed before the remainder of the program is translated. In this manner it appears that the compiler is a "one-pass" system, and the retained information about the overall program can be minimized. For example, functions and procedures can be treated as separate units by the compiler, and even within the program units blocks can be shown to encompass a scope that requires only a minimal amount of information to be retained across its boundaries. Apart from storage references, statements in procedural languages are almost independent of each other and can be compiled separately.

    Similarly in that element of the compiler which we identify as the syntactic analyzer, the actual processor may be made up of a collection of analyzers each attuned to a particular language element. A common structure of a syntactic analyzer is as follows:

    • A recursive descent recogizer that works over the upper level of the language and which calls on other analyzers to process special components. This is a top-down, predictive analyzer that does the "coarse" work of analysis and recognition.
    • An expression analyzer, commonly based on the Samelson-Bauer algorithm, that uses a push-down automata to analyze infix expressions.
    • A LR parser (SALR, LALR, etc.) with a collection of small tables each capable of the analysis of specific components of the language that would benefit from bottom-up analysis. Each language component is regarded as a simple language of its own, well bounded by delimiters, and capable of fast analysis or recognition.

    Syntactic analyzers can be classified into two types depending on their approach to analysis. The process of syntactic analysis can be viewed as the process of constructing the syntactic tree that exists between a root symbol and an instance of the language. A top-down analyzer starts from the root symbol of the grammar and successively predicts (usually in a left-to-right manner) what its constituent parts should be. Working down the left-hand edge of the tree, a top-down analyzer repeatedly makes predictions until a terminal in predicted. This must then be compared to the leftmost element of the string under analysis. If found to match then the analyzer has satisfied its first goal and can then backtrack to find a successor goal to match to the rest of the string. If there is not a match then the analyzer must search for alternatives while backtracking to select an alternative goal. Top-down processors can not be constructed for left-recursive grammars and thus their applicability can be somewhat limited. The most common top-down analyzer is the recursive descent processor; a second useful system is that of Cheatham and Sattley.

    Bottom-up analyzers work on the problem from the string to be analyzed and attempt to construct a syntactic tree by recognizing the right-hand sides of the grammar rules and thus reducing those portions of the string to their non-terminals. This process is repeated until the only remaining non-terminal is the root symbol of the grammar. Knuth's LR parsers are the classic bottom-up processors, though simplifications can often be made to the language to improve the efficiency of the processor. Being table driven, the LR tables can grow very large when it is attempted to analyze the whole language by this method. On the other hand, if components of the language are split off for analysis, such as by a recursive descent system, their corresponding tables can be easily manageable. LR parsing is akin to FSMs or Push Down Automata, being a state transition system; once the table have been constructed LR parsing can be very fast.

    Recursive Descent

    Cheatham/Sattley Method

    Samelson/Bauer Expression Analysis

    LR Parsing

    LR parsing is a bottom-up method of parsing that starts with the input string, replaces right-hand sides of rules by left-hand sides until the start symbol of the grammar has been derived, thus generating a rightmost derivation in reverse order. Since right hand sides of rules usually contain multiple symbols, rules are represented internally as items. An item has a marker in one position of the right hand side of the rule indicating that symbols to the left of the marker have already been recognized, and symbols to the right of the marker must still be recognized in order to match this rule. An LR(1) parser groups the items into states such that an item is in a state if and only if the item can match the current input to the parser. These states, referred to as item sets, and the transitions between them can be represented by a deterministic finite automaton (DFA). All of this information can be stored in a table to facilitate parsing. Rows represent states of the parser, and each row's corresponding columns indicate a course of action to be taken based on the grammar symbol that identifies each column. (From Michael James at RPI). Reference: BYACC/Java is an extension of the Berkeley v 1.8 YACC-compatible parser generator. Standard YACC takes a YACC source file, and generates one or more C files from it, which if compiled properly, will produce a LALR-grammar parser. This is useful for expression parsing, interactive command parsing, and file reading.

  • Projects
    • Assignments
      • Examination Questions
        • Bibliography

        © J.A.N. LEE
        Last Updated 99/09/16