|
|
 |
LEXICAL ANALYSIS
Objectives
To discover the relationship between linear grammars and Finite State Machines (FSM), and extend the concept of FSM as a recognizer to an action augmented implementation.
Notes
- The lexical analyzer is the first component of a compiler and is responsible for the first stage of recognizing the top level elements of the program;
- The basic driver of a lexical analyzer is a Finite State Machine (FSM) with actions appended to the intermediate states; **
- The FSM is derived from a linear grammar representation of the upper level of the language, recognizing such elements as:
- numbers
- identifiers
- begin-end blocks
- statements
- program units (functions, procedures, etc.)
- The output from a lexical analyzer is a "tokenized" code which identifies the elements of the code with tags;
- Commonly the output is also in a fixed format so that the compiler does not have to repeat the character by character scanning of the program;
- The lexical analyzer is associated with the initial development of the symbol table.
A study of formal languages shows that there is a equivalence between linear grammars, regular expressions, and finite state machines. Consequently given a linear grammar or regular expression for a language component (such as an identifier) a finite state machine can be built easily.
Projects
- Based on a linear grammar for floating point number representations, build a action augmented FSM that will recognize floating point numbers and output the binary equivalent (probably in "E-notation").
Assignments
Examination Questions
- Note:
Finite State Machines with actions basically take two forms:
- Those in which the actions are attached to the states, so that the action is taken when that state is reached, and
- Those in which the actions are associated with the state transitions, where the action is taken as a particular transition is followed.
In some cases a combination of approaches is useful.
|