NAME
  yacc - Generates an LR(1) parsing program from input consisting of a
  context-free grammar specification

SYNOPSIS

  yacc [-vltds]	[-b prefix] [-N	number]	[-p symbol_prefix] [-P pathname]
  grammar

  The yacc command converts a context-free grammar specification into a	set
  of tables for	a simple automaton that	executes an LR(1) parsing algorithm.

FLAGS

  -b prefix
      Uses prefix instead of y as the prefix for all output filenames
      (prefix.tab.c, prefix.tab.h, and prefix.output).

  -d  Produces the y.tab.h file, which contains	the #define statements that
      associate	the yacc-assigned token	codes with your	token names.  This
      allows source files other	than y.tab.c to	access the token codes by
      including	this header file.

  -l  Includes no #line	constructs in y.tab.c.	Use this only after the	gram-
      mar and associated actions are fully debugged.

  -N number
      Provides yacc with extra storage for building its LALR tables, which
      may be necessary when compiling very large grammars.  The number
      should be larger than 40,000 when you use this flag.

  -p symbol_prefix
      Allows multiple yacc parsers to be linked	together.  Use symbol_prefix
      instead of yy to prefix global symbols.

  -P pathname
      Specifies an alternative parser (instead of /usr/ccs/lib/yaccpar).  The
      pathname specifies the filename of the skeleton to be used in place of
      yaccpar.

  -s  Breaks the yyparse() function into several smaller functions.  Because
      its size is somewhat proportional	to that	of the grammar,	it is possi-
      ble for yyparse()	to become too large to compile,	optimize, or execute
      efficiently.

  -t  Compiles run-time	debugging code.	 By default, this code is not
      included when y.tab.c is compiled.  If YYDEBUG has a nonzero value, the
      C	compiler (cc) includes the debugging code, whether or not the -t flag
      was used.	 Without compiling this	code, yyparse()	will run more
      quickly.

  -v  Produces the y.output file, which contains a readable description of
      the parsing tables and a report on conflicts generated by grammar
      ambiguities.

DESCRIPTION

  The yacc program reads its skeleton parser from the file
  /usr/ccs/lib/yaccpar.	 Use the environment variable PARSER to	specify
  another location for yacc to read from.

  Syntax for yacc Input


  This section contains	a formal description of	the yacc input file (or	gram-
  mar file), which is normally named with a .y suffix.	The section provides
  a listing of the special values, macros, and functions recognized by yacc.

  The general format of	the yacc input file is:

       [ definitions ]
       %%
       [ rules ]
       [ %%
       [ user functions	] ]

  where

  definitions
	    Is the section where you define the	variables to be	used later in
	    the	grammar, such as in the	rules section.	It is also where
	    files are included (#include) and processing conditions are
	    defined.  This section is optional.

  rules	    Is the section that	contains grammar rules for the parser.	A
	    yacc input file must have a	rules section.

  user functions
	    Is the section that	contains user-supplied functions that can be
	    used by the	actions	in the rules section.  This section is
	    optional.

  The NUL character must not be used in grammar rules or literals.  Each
  line in the definitions section can be:

  %{

  %}	    When placed	on lines by themselves,	these enclose C	code to	be
	    passed into	the global definitions of the output file.  Such
	    lines commonly include preprocessor	directives and declarations
	    of external	variables and functions.

  %token [type] token [number] [token [number]] ...
	    Lists tokens or terminal symbols to be used in the rest of the
	    input file.  This line is needed for tokens that do not appear
	    in other % definitions.  If type is present, the C type for all
	    tokens on this line is declared to be the type referenced by
	    type.  If a positive integer number follows a token, that value
	    is assigned to the token.

  %nonassoc token [token ...]
	    Indicates that the token cannot be used associatively.

  %start symbol
	    Indicates the highest-level	production rule	to be reduced; in
	    other words, the rule where	the parser can consider	its work done
	    and	terminate.  If this definition is not included,	the parser
	    uses the first production rule.  The symbol	must be	non-terminal
	    (not a token).

  %type	< type > symbol	[ symbol ... ]
	    Defines each symbol as data type type, to resolve ambiguities.  If
	    this construct is present, yacc performs type checking; other-
	    wise, it assumes all symbols to be of type integer.

  %union union-def
	    Defines the	yylval global variable as a union, where union-def is
	    a standard C definition in the format:
		 { type	member ; [ type	member ; ... ] }

	    At least one member	should be an int.  Any valid C data type can
	    be defined,	including structures.  When you	run yacc with the -d
	    option, the	definition of yylval is	placed in the y.tab.h file
	    and	can be referred	to in a	lex input file.
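
  As an illustration of the %union description above, the following C frag-
  ment sketches what such a definition amounts to once yacc has processed it.
  The member names (ival, dval, name) and the two helper functions are
  hypothetical and exist only for this sketch; the YYSTYPE and yylval names
  are the ones used elsewhere in this reference page.

```c
#include <string.h>

/* Hypothetical union-def:  { int ival; double dval; char name[16]; }  */
typedef union {
    int    ival;        /* integer constants */
    double dval;        /* floating-point constants */
    char   name[16];    /* short identifiers */
} YYSTYPE;

/* With -d, a matching declaration of yylval is written to y.tab.h so
 * that a lex input file can refer to it.  It is defined here so that
 * the fragment stands alone. */
YYSTYPE yylval;

/* What a lexical analyzer action would do before returning a token. */
void set_int_value(int v)
{
    yylval.ival = v;
}

void set_name_value(const char *s)
{
    strncpy(yylval.name, s, sizeof yylval.name - 1);
    yylval.name[sizeof yylval.name - 1] = '\0';   /* always terminate */
}
```

  Only one member of the union holds a meaningful value at any time, so the
  token code returned by yylex tells the parser which member to read.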

  Every token (terminal symbol) must be listed in one of the preceding %
  definitions.	Multiple tokens	can be separated by white space	or commas.
  All the tokens in %left, %right, and %nonassoc definitions are assigned a
  precedence with tokens in later definitions having precedence	over those in
  earlier definitions.
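
  The effect of this ordering can be illustrated outside of yacc.  The fol-
  lowing C sketch evaluates an expression by precedence climbing, which is
  a different technique from the LALR tables that yacc builds and is used
  here only because it fits in a few lines.  The precedence levels mirror
  the %left lines of the calc.y example (later lines bind more tightly); the
  function names and the treatment of division by zero are assumptions made
  for this sketch.

```c
#include <ctype.h>

static const char *p;   /* cursor into the expression being evaluated */

/* Precedence level of a binary operator; 0 means "not an operator".
 * Higher numbers correspond to later %left lines. */
static int prec(int op)
{
    switch (op) {
    case '|':                     return 1;
    case '&':                     return 2;
    case '+': case '-':           return 3;
    case '*': case '/': case '%': return 4;
    default:                      return 0;
    }
}

static int apply(int op, int a, int b)
{
    switch (op) {
    case '|': return a | b;
    case '&': return a & b;
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': return b ? a / b : 0;   /* sketch: 0 on division by zero */
    default:  return b ? a % b : 0;   /* '%' */
    }
}

/* Operand: a run of decimal digits. */
static int primary(void)
{
    int v = 0;

    while (isdigit((unsigned char)*p))
        v = 10 * v + (*p++ - '0');
    return v;
}

/* Parse operators whose level is at least min_prec. */
static int expr(int min_prec)
{
    int lhs = primary();

    while (prec(*p) >= min_prec) {
        int op = *p++;
        /* Asking for a strictly higher level on the right-hand side
         * makes the operator left-associative, as %left does. */
        int rhs = expr(prec(op) + 1);
        lhs = apply(op, lhs, rhs);
    }
    return lhs;
}

/* Evaluate an expression written without white space. */
int eval(const char *s)
{
    p = s;
    return expr(1);
}
```

  With these levels, eval("2+3*4") yields 14 because '*' is declared later
  than '+', and eval("8-3-2") yields 3 because '-' groups from the left.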

  In addition to symbols, a token can be a literal character enclosed in
  single quotes.  (Multibyte characters are recognized by the lexical ana-
  lyzer and returned as tokens.)  The following special characters can be
  used, just as in C programs:

  \a Alert

  \n Newline

  \t Tab

  \v Vertical tab

  \r Carriage Return

  \b Backspace

  \f Form Feed

  \\ Backslash

  \' Single Quote

  Terminal symbols must be declared in %token definitions.

  Each symbol-sequence represents an alternative way of	reducing the rule.  A
  symbol can appear recursively	in its own rule.  Always use left-recursion
  (where the recursive symbol appears before the terminating case in
  symbol-sequence).

  The specific sequence:

       %prec token

  indicates that the current sequence of symbols is to be preferred over oth-
  ers, at the level of precedence assigned to token in the definitions sec-
  tion.

  The specially	defined	token error matches any	unrecognized sequence of
  input.  This token causes the	parser to invoke the yyerror function.	By
  default, the parser tries to synchronize with	the input and continue pro-
  cessing it by	reading	and discarding all input up to the symbol following
  error.  (You can override this behavior through the yyerrok action.)	If no
  error	token appears in the yacc input	file, the parser exits with an error
  message upon encountering unrecognized input.

  The parser always executes action after encountering the symbol that pre-
  cedes	it.  Thus, an action can appear	in the middle of a symbol-sequence,
  after	each symbol-sequence, or after multiple	instances of symbol-sequence.
  In the last case, action is executed when the	parser matches any of the
  sequences.

  The action consists of standard C code within	braces and can also take the
  following values, variables, and keywords.

  yylval    If the token returned by the yylex function	is associated with a
	    significant	value, yylex should place the value in this global
	    variable.  By default, yylval is of	type int.  The definitions
	    section can	include	a %union definition to associate with other
	    data types, including structures.  If you run yacc with the -d
	    option, the full yylval definition is passed into the y.tab.h
	    file for access by lex.

  yyerrok   Causes the parser to start parsing tokens immediately after	an
	    erroneous sequence,	instead	of performing the default action of
	    reading and	discarding tokens up to	a synchronization token.  The
	    yyerrok action should appear immediately after the error token.

  $ [ <type> ] n
	    Refers to symbol n,	a token	index in the production, counting
	    from the beginning of the production rule, where the first symbol
	    after the colon is $1.  The	type variable is the name of one of
	    the	union lines listed in the %union directive in the declaration
	    section.  The <type> syntax	(non-standard) allows the value	to be
	    cast to a specific data type.  Note that you will rarely need to
	    use the type syntax.

  The following	functions, which are contained in the user functions section,
  are invoked within the yyparse function generated by yacc.

  yylex()   The	lexical	analyzer called	by yyparse to recognize	each token of
	    input.  Usually this function is created by	lex.  yylex reads
	    input, recognizes expressions within the input, and	returns	a
	    token number representing the kind of token	read.  The function
	    returns an int value.  A return value of 0 (zero) means the	end
	    of input.

	    If the parser and yylex do not agree on these token	numbers,
	    reliable communication between them cannot occur.  For single-
	    character literals, the token is simply the numeric value of the
	    character in the current character set.  The numbers for other
	    tokens can be chosen either by yacc or by the user.  In either
	    case, the #define construct of C is used to allow yylex() to
	    return these numbers symbolically. The #define statements are put
	    into the code file,	and the	header file if that file is
	    requested. The set of characters permitted by yacc in an identif-
	    ier	is larger than that permitted by C. Token names	found to con-
	    tain such characters will not be included in the #define declara-
	    tions.

	    If the token numbers are chosen by yacc, the tokens other than
	    literals are assigned numbers greater than 256, although no
	    order is implied.  A token can be explicitly assigned a number by
	    following its first appearance in the declaration section with a
	    number.  Names and literals not defined this way retain their
	    default definition.  All assigned token numbers are unique and
	    distinct from the token numbers used for literals.  If duplicate
	    token numbers cause conflicts in parser generation, yacc reports
	    an error; otherwise, it is unspecified whether the token assign-
	    ment is accepted or an error is reported.

	    The	end of the input is marked by a	special	token called the end-
	    marker that	has a token number that	is zero	or negative. All lex-
	    ical analyzers return zero or negative as a	token number upon
	    reaching the end of their input.  If the tokens up to, but not
	    including, the endmarker form a structure that matches the start
	    symbol, the	parser accepts the input.  If the endmarker is seen
	    in any other context, it is	considered an error.
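
  The conventions described above can be seen in a hand-written lexical
  analyzer.  The following C sketch is not what lex generates; it is a mini-
  mal illustration that assumes the token codes DIGIT and LETTER (which
  would normally come from y.tab.h), an int-typed yylval, and a hypothetical
  set_input() helper that supplies the input as a string.

```c
#include <ctype.h>

/* Hypothetical token codes; with yacc -d they would come from y.tab.h.
 * Codes chosen by yacc for named tokens are greater than 256. */
#define DIGIT  257
#define LETTER 258

int yylval;                        /* default type is int */
static const char *yyinput = "";   /* hypothetical in-memory input */

/* Hypothetical helper: point the analyzer at a new input string. */
void set_input(const char *s)
{
    yyinput = s;
}

int yylex(void)
{
    int c;

    while (*yyinput == ' ')        /* skip blanks */
        yyinput++;
    if (*yyinput == '\0')
        return 0;                  /* endmarker: end of input */

    c = (unsigned char)*yyinput++;
    if (isdigit(c)) {
        yylval = c - '0';          /* value associated with the token */
        return DIGIT;
    }
    if (islower(c)) {
        yylval = c - 'a';
        return LETTER;
    }
    return c;                      /* literal: its own character code */
}
```

  Once the input is exhausted, every further call returns 0, so the parser
  sees the endmarker no matter how often it asks for another token.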

  yyerror(string)
	    The	function that the parser calls upon encountering an input
	    error.  The	default	function, defined in liby.a, simply prints
	    string to the standard error.  The user can	redefine the func-
	    tion.  The function's type is void.

  The liby.a library contains default main() and yyerror() functions.  These
  look like the	following, respectively:
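
  A minimal sketch of such defaults, modeled on the main() and yyerror()
  routines of the calc.y example in the EXAMPLES section rather than copied
  from the library source.  The entry point is named liby_main here (a
  hypothetical name) so that the fragment can be compiled next to a program
  that already has its own main(), and the yyparse() shown is a stand-in
  stub, not a generated parser.

```c
#include <stdio.h>

/* Stand-in for the parser that yacc generates in y.tab.c (assumption:
 * it reports success).  A real build links the generated yyparse(). */
int yyparse(void)
{
    return 0;
}

/* Sketch of the default entry point; in the library it is main(). */
int liby_main(void)
{
    return yyparse();
}

/* Sketch of the default error routine: print the message that the
 * parser passes in on the standard error. */
void yyerror(const char *s)
{
    fprintf(stderr, "%s\n", s);
}
```

  A program that supplies its own main() and yyerror(), as calc.y does, has
  no need for these defaults.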


  Comments, in C syntax, can appear anywhere in	the user functions or defini-
  tions	sections.  In the rules	section, comments can appear wherever a	sym-
  bol is allowed.  Blank lines or lines	consisting of white space can be
  inserted anywhere in the file, and are ignored.

EXAMPLES

  This section describes the example programs for the lex and yacc commands,
  which	together create	a simple desk calculator program that performs addi-
  tion,	subtraction, multiplication, and division operations.  The calculator
  program also allows you to assign values to variables	(each designated by a
  single lowercase ASCII letter), and then use the variables in	calculations.
  The files that contain the program are as follows:

  calc.l
      The lex specification file that defines the lexical analysis rules.

  calc.y
      The yacc grammar file that defines the parsing rules and calls the
      yylex() function created by lex to provide input.

  The remaining	text expects that the current directory	is the directory that
  contains the lex and yacc example program files.

  Compiling the	Example	Program

  Perform the following	steps to create	the example program using lex and
  yacc:

   1.  Process the yacc	grammar	file using the -d flag.	 The -d	flag tells
       yacc to create a	file that defines the tokens it	uses in	addition to
       the C language source code.
	    yacc -d calc.y


   2.  The following files are created (the *.o	files are created temporarily
       and then	removed):

       y.tab.c
	   The C language source file that yacc	created	for the	parser.

       y.tab.h
	   A header file containing #define statements for the tokens used by
	   the parser.

   3.  Process the lex specification file:
	    lex	calc.l


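   4.  The following file is created:

       lex.yy.c
	   The C language source file that lex created for the lexical
	   analyzer.

   5.  Compile and link the two C language source files:
	    cc -o calc y.tab.c lex.yy.c


   6.  The following files are created:

       lex.yy.o
	   The object file for lex.yy.c.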

       calc
	   The executable program file.

       You can then run	the program directly by	entering:


	    calc


       Then enter numbers and operators	in calculator fashion.	After you
       press <Return>, the program displays the	result of the operation.  If
       you assign a value to a variable	as follows, the	cursor moves to	the
       next line:
	    m=4	<Return>
	    _


       You can then use	the variable in	calculations and it will have the
       value assigned to it:
	    m+5	<Return>
	    9


  The Parser Source Code

  The text that	follows	shows the contents of the file calc.y.	This file has
  entries in all three of the sections of a yacc grammar file:	declarations,
  rules, and programs.

       %{
       #include	<stdio.h>

       int regs[26];
       int base;

       %}

       %start list

       %token DIGIT LETTER

       %left '|'
       %left '&'
       %left '+' '-'
       %left '*' '/' '%'
       %left UMINUS /*supplies precedence for unary minus */

       %%      /*beginning of rules section */

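       list    :       /*empty */
	       |       list stat '\n'
	       |       list error '\n'
		       {       yyerrok;        }
	       ;

       stat    :       expr
		       {       printf("%d\n",$1);      }
	       |       LETTER '=' expr
		       {       regs[$1] = $3;  }
	       ;

       expr    :       '(' expr ')'
		       {       $$ = $2;        }
	       |       expr '*' expr
		       {       $$ = $1 * $3;   }
	       |       expr '/' expr
		       {       $$ = $1 / $3;   }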
	       |       expr '%'	expr
		       {       $$ = $1 % $3;   }
	       |       expr '+'	expr
		       {       $$ = $1 + $3;   }
	       |       expr '-'	expr
		       {       $$ = $1 - $3;   }
	       |       expr '&'	expr
		       {       $$ = $1 & $3;   }
	       |       expr '|'	expr
		       {       $$ = $1 | $3;   }
	       |       '-' expr	%prec UMINUS
		       {       $$ = -$2;       }
	       |       LETTER
		       {       $$ = regs[$1];  }
	       |       number
	       ;

       number  :       DIGIT
		       {       $$ = $1;	base = ($1==0) ? 8:10; }
	       |       number  DIGIT
		       {       $$ = base * $1 +	$2;    }
	       ;

       %%
       main()
       {
	       return(yyparse());
       }

       yyerror(s)
       char *s;
       {
	       fprintf(stderr,"%s\n",s);
       }

       yywrap()
       {
	       return(1);
       }
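
  The number rule in the listing above builds a value one digit at a time
  and picks the base from the first digit.  The following C sketch repeats
  that arithmetic outside the parser so it can be followed step by step.
  The function name number_value and its -1 return for non-numeric input
  are assumptions made for this sketch and are not part of calc.y.

```c
#include <ctype.h>

/* Mirrors the two alternatives of the number rule in calc.y:
 *   number : DIGIT          value = digit; base = (digit == 0) ? 8 : 10
 *          | number DIGIT   value = base * value + digit
 */
int number_value(const char *digits)
{
    int base, value;

    if (!isdigit((unsigned char)digits[0]))
        return -1;                    /* sketch-only convention */

    value = digits[0] - '0';          /* first alternative */
    base = (value == 0) ? 8 : 10;     /* leading 0 selects octal */

    for (digits++; isdigit((unsigned char)*digits); digits++)
        value = base * value + (*digits - '0');   /* second alternative */

    return value;
}
```

  For example, number_value("17") is 17, while number_value("017") is 15
  because the leading zero switches the base to 8.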


  Declarations Section

  This section contains	entries	that perform the following functions:

    +  Includes	standard I/O header file.

    +  Defines global variables.

  Programs Section

  The programs section contains the following routines.  Because these rou-
  tines are included in this file, you do not need to use the yacc library
  when processing this file.

  main()     The required main program that calls yyparse() to start the pro-
	     gram.

  yyerror(s) This error	handling routine only prints a syntax error message.

  yywrap()   The wrap-up routine that returns a	value of 1 when	the end	of
	     input occurs.

  The Lexical Analyzer Source Code

  The following shows the contents of the file calc.l.  This file contains
  include statements for standard input and output, as well as for the
  y.tab.h file.  The yacc program generates that file from the yacc grammar
  file information, if you use the -d flag with the yacc command.  The file
  y.tab.h contains definitions for the tokens that the parser program uses.
  In addition, calc.l contains the rules used to generate the tokens from
  the input stream.

       %{

       #include	<stdio.h>
       #include	"y.tab.h"
       int c;
       extern YYSTYPE yylval;
       %}
       %%
       " "     ;
       [a-z]   {
		       c = yytext[0];
		       yylval =	c - 'a';
		       return(LETTER);
	       }
       [0-9]   {
		       c = yytext[0];
		       yylval =	c - '0';
		       return(DIGIT);
	       }
       [^a-z 0-9]      {
		       c = yytext[0];
		       return(c);
		       }


FILES

  y.output   A readable	description of parsing tables and a report on con-
	     flicts generated by grammar ambiguities.

  y.tab.c    Output file.

RELATED	INFORMATION

  Commands:  lex(1).

  Programming Support Tools