583 lines
14 KiB
Groff
583 lines
14 KiB
Groff
.\" $NetBSD: yacc.1,v 1.6 2016/06/06 08:22:52 wiz Exp $
|
|
.\"
|
|
.\" Copyright (c) 1989, 1990 The Regents of the University of California.
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" This code is derived from software contributed to Berkeley by
|
|
.\" Robert Paul Corbett.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\" 3. Neither the name of the University nor the names of its contributors
|
|
.\" may be used to endorse or promote products derived from this software
|
|
.\" without specific prior written permission.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" from: @(#)yacc.1 5.7 (Berkeley) 7/30/91
|
|
.\" from: Id: yacc.1,v 1.24 2014/10/06 00:03:48 tom Exp
|
|
.\" $NetBSD: yacc.1,v 1.6 2016/06/06 08:22:52 wiz Exp $
|
|
.\"
|
|
.Dd October 5, 2014
|
|
.Dt YACC 1
|
|
.Os
|
|
.Sh NAME
|
|
.Nm yacc
|
|
.Nd an
|
|
.Tn LALR(1)
|
|
parser generator
|
|
.Sh SYNOPSIS
|
|
.Nm
|
|
.Op Fl BdgilLPrtvVy
|
|
.Op Fl b Ar file_prefix
|
|
.Op Fl o Ar output_file
|
|
.Op Fl p Ar symbol_prefix
|
|
.Ar filename
|
|
.Sh DESCRIPTION
|
|
.Nm
|
|
reads the grammar specification in the file
|
|
.Ar filename
|
|
and generates an
|
|
.Tn LALR(1)
|
|
parser for it.
|
|
The parsers consist of a set of
|
|
.Tn LALR(1)
|
|
parsing tables and a driver routine
|
|
written in the C programming language.
|
|
.Nm
|
|
normally writes the parse tables and the driver routine to the file
|
|
.Pa y.tab.c .
|
|
.Pp
|
|
The following options are available:
|
|
.Bl -tag -width symbol_prefixXXX
|
|
.It Fl b Ar file_prefix
|
|
The
|
|
.Fl b
|
|
option changes the prefix prepended to the output file names to
|
|
the string denoted by
|
|
.Ar file_prefix .
|
|
The default prefix is the character
|
|
.Ar y .
|
|
.It Fl B
|
|
Create a backtracking parser (compile-type configuration for
|
|
.Nm ) .
|
|
.It Fl d
|
|
The
|
|
.Fl d
|
|
option causes the header file
|
|
.Pa y.tab.h
|
|
to be written.
|
|
It contains #define's for the token identifiers.
|
|
.It Fl g
|
|
The
|
|
.Fl g
|
|
option causes a graphical description of the generated
|
|
.Tn LALR(1)
|
|
parser to be written to the file
|
|
.Pa y.dot
|
|
in graphviz format, ready to be processed by
|
|
.Xr dot 1 .
|
|
.It Fl i
|
|
The
|
|
.Fl i
|
|
option causes a supplementary header file
|
|
.Pa y.tab.i
|
|
to be written.
|
|
It contains extern declarations
|
|
and supplementary #define's as needed to map the conventional
|
|
.Nm
|
|
yy-prefixed names to whatever the
|
|
.Fl p
|
|
option may specify.
|
|
The code file, e.g.,
|
|
.Pa y.tab.c
|
|
is modified to #include this file
|
|
as well as the
|
|
.Pa y.tab.h
|
|
file, enforcing consistent usage of the symbols defined in those files.
|
|
The supplementary header file makes it simpler to separate compilation
|
|
of lex- and yacc-files.
|
|
.It Fl l
|
|
If the
|
|
.Fl l
|
|
option is not specified,
|
|
.Nm
|
|
will insert #line directives in the generated code.
|
|
The #line directives let the C compiler relate errors in the
|
|
generated code to the user's original code.
|
|
If the
|
|
.Fl l
|
|
option is specified,
|
|
.Nm
|
|
will not insert the #line directives.
|
|
#line directives specified by the user will be retained.
|
|
.It Fl L
|
|
Enable position processing, e.g.,
|
|
.Dq %locations
|
|
(compile-type configuration for
|
|
.Nm ) .
|
|
.It Fl o Ar output_file
|
|
Specify the filename for the parser file.
|
|
If this option is not given, the output filename is
|
|
the file prefix concatenated with the file suffix, e.g.
|
|
.Pa y.tab.c .
|
|
This overrides the
|
|
.Fl b
|
|
option.
|
|
.It Fl P
|
|
The
|
|
.Fl P
|
|
options instructs
|
|
.Nm
|
|
to create a reentrant parser, like
|
|
.Dq %pure-parser
|
|
does.
|
|
.It Fl p Ar symbol_prefix
|
|
The
|
|
.Fl p
|
|
option changes the prefix prepended to yacc-generated symbols to
|
|
the string denoted by
|
|
.Ar symbol_prefix .
|
|
The default prefix is the string
|
|
.Ar yy .
|
|
.It Fl r
|
|
The
|
|
.Fl r
|
|
option causes
|
|
.Nm
|
|
to produce separate files for code and tables.
|
|
The code file is named
|
|
.Pa y.code.c ,
|
|
and the tables file is named
|
|
.Pa y.tab.c .
|
|
The prefix
|
|
.Dq y .
|
|
can be overridden using the
|
|
.Fl b
|
|
option.
|
|
.It Fl s
|
|
Suppress
|
|
.Dq #define
|
|
statements generated for string literals in a
|
|
.Dq %token
|
|
statement, to more closely match original
|
|
.Nm
|
|
behavior.
|
|
.Pp
|
|
Normally when
|
|
.Nm
|
|
sees a line such as
|
|
.Dq %token OP_ADD "ADD"
|
|
it notices that the quoted
|
|
.Dq ADD
|
|
is a valid C identifier, and generates a #define not only for
|
|
.Dv OP_ADD ,
|
|
but for
|
|
.Dv ADD
|
|
as well,
|
|
e.g.,
|
|
.Bd -literal -offset indent
|
|
#define OP_ADD 257
|
|
#define ADD 258
|
|
.Ed
|
|
The original
|
|
.Nm
|
|
does not generate the second
|
|
.Dq #define .
|
|
The
|
|
.Fl s
|
|
option suppresses this
|
|
.Dq #define .
|
|
.Pp
|
|
.St -p1003.1
|
|
documents only names and numbers for
|
|
.Dq %token ,
|
|
though the original
|
|
.Nm
|
|
and
|
|
.Xr bison 1
|
|
also accept string literals.
|
|
.It Fl t
|
|
The
|
|
.Fl t
|
|
option changes the preprocessor directives generated by
|
|
.Nm
|
|
so that debugging statements will be incorporated in the compiled code.
|
|
.It Fl V
|
|
The
|
|
.Fl V
|
|
option prints the version number to the standard output.
|
|
.It Fl v
|
|
The
|
|
.Fl v
|
|
option causes a human-readable description of the generated parser to
|
|
be written to the file
|
|
.Pa y.output .
|
|
.It Fl y
|
|
.Nm
|
|
ignores this option,
|
|
which
|
|
.Xr bison 1
|
|
supports for ostensible POSIX compatibility.
|
|
.El
|
|
.Sh EXTENSIONS
|
|
.Nm
|
|
provides some extensions for
|
|
compatibility with
|
|
.Xr bison 1
|
|
and other implementations of yacc.
|
|
The
|
|
.Dq %destructor
|
|
and
|
|
.Dq %locations
|
|
features are available only if
|
|
.Nm yacc
|
|
has been configured and compiled to support the back-tracking functionality.
|
|
The remaining features are always available:
|
|
.Pp
|
|
.Dv %destructor {
|
|
.Dv code }
|
|
.Dv symbol+
|
|
.Pp
|
|
Defines code that is invoked when a symbol is automatically
|
|
discarded during error recovery.
|
|
This code can be used to
|
|
reclaim dynamically allocated memory associated with the corresponding
|
|
semantic value for cases where user actions cannot manage the memory
|
|
explicitly.
|
|
.Pp
|
|
On encountering a parse error, the generated parser
|
|
discards symbols on the stack and input tokens until it reaches a state
|
|
that will allow parsing to continue.
|
|
This error recovery approach results in a memory leak
|
|
if the
|
|
.Dq YYSTYPE
|
|
value is, or contains, pointers to dynamically allocated memory.
|
|
.Pp
|
|
The bracketed
|
|
.Dv code
|
|
is invoked whenever the parser discards one of the symbols.
|
|
Within
|
|
.Dv code ,
|
|
.Dq $$
|
|
or
|
|
.Dq $\*[Lt]tag\*[Gt]$
|
|
designates the semantic value associated with the discarded symbol, and
|
|
.Dq @$
|
|
designates its location (see
|
|
.Dq %locations
|
|
directive).
|
|
.Pp
|
|
A per-symbol destructor is defined by listing a grammar symbol
|
|
in
|
|
.Dv symbol+ .
|
|
A per-type destructor is defined by listing a semantic type tag (e.g.,
|
|
.Dq \*[Lt]some_tag\*[Gt] )
|
|
in
|
|
.Dv symbol+ ;
|
|
in this case, the parser will invoke
|
|
.Dv code
|
|
whenever it discards any grammar symbol that has that semantic type tag,
|
|
unless that symbol has its own per-symbol destructor.
|
|
.Pp
|
|
Two categories of default destructor are supported that are
|
|
invoked when discarding any grammar symbol that has no per-symbol and no
|
|
per-type destructor:
|
|
.Pp
|
|
The code for
|
|
.Dq \*[Lt]*\*[Gt]
|
|
is used
|
|
for grammar symbols that have an explicitly declared semantic type tag
|
|
(via
|
|
.Dq %type ) ;
|
|
.Pp
|
|
the code for
|
|
.Dq \*[Lt]\*[Gt]
|
|
is used for grammar symbols that have no declared semantic type tag.
|
|
.Pp
|
|
.Bl -tag -width "%expect-rr number" -compact
|
|
.It Dv %expect Ar number
|
|
Tell
|
|
.Nm
|
|
the expected number of shift/reduce conflicts.
|
|
That makes it only report the number if it differs.
|
|
.It Dv %expect-rr Ar number
|
|
Tell
|
|
.Nm
|
|
the expected number of reduce/reduce conflicts.
|
|
That makes it only report the number if it differs.
|
|
This is (unlike
|
|
.Xr bison 1 )
|
|
allowable in
|
|
.Tn LALR(1)
|
|
parsers.
|
|
.It Dv %locations
|
|
Tell
|
|
.Nm
|
|
to enable management of position information associated with each token,
|
|
provided by the lexer in the global variable
|
|
.Dv yylloc ,
|
|
similar to management of semantic value information provided in
|
|
.Dv yylval .
|
|
.Pp
|
|
As for semantic values, locations can be referenced within actions using
|
|
.Dv @$
|
|
to refer to the location of the left hand side symbol, and
|
|
.Dv @N
|
|
.Dv ( N
|
|
an integer) to refer to the location of one of the right hand side
|
|
symbols.
|
|
Also as for semantic values, when a rule is matched, a default
|
|
action is used the compute the location represented by
|
|
.Dv @$
|
|
as the beginning of the first symbol and the end of the last symbol
|
|
in the right hand side of the rule.
|
|
This default computation can be overridden by
|
|
explicit assignment to
|
|
.Dv @$
|
|
in a rule action.
|
|
.Pp
|
|
The type of
|
|
.Dv yylloc
|
|
is
|
|
.Dv YYLTYPE ,
|
|
which is defined by default as:
|
|
.Bd -literal -offset indent
|
|
typedef struct YYLTYPE {
|
|
int first_line;
|
|
int first_column;
|
|
int last_line;
|
|
int last_column;
|
|
} YYLTYPE;
|
|
.Ed
|
|
.Pp
|
|
.Dv YYLTYPE
|
|
can be redefined by the user
|
|
.Dv ( YYLTYPE_IS_DEFINED
|
|
must be defined, to inhibit the default)
|
|
in the declarations section of the specification file.
|
|
As in
|
|
.Xr bison 1 ,
|
|
the macro
|
|
.Dv YYLLOC_DEFAULT
|
|
is invoked each time a rule is matched to calculate a position for the
|
|
left hand side of the rule, before the associated action is executed;
|
|
this macro can be redefined by the user.
|
|
.Pp
|
|
This directive adds a
|
|
.Dv YYLTYPE
|
|
parameter to
|
|
.Fn yyerror .
|
|
If the
|
|
.Dq %pure-parser
|
|
directive is present,
|
|
a
|
|
.Dv YYLTYPE
|
|
parameter is added to
|
|
.Fn yylex
|
|
calls.
|
|
.It Dv %lex-param Ar { Ar argument-declaration Ar }
|
|
By default, the lexer accepts no parameters, e.g.,
|
|
.Fn yylex .
|
|
Use this directive to add parameter declarations for your customized lexer.
|
|
.It Dv %parse-param Ar { Ar argument-declaration Ar }
|
|
By default, the parser accepts no parameters, e.g.,
|
|
.Fn yyparse .
|
|
Use this directive to add parameter declarations for your customized parser.
|
|
.It Dv %pure-parser
|
|
Most variables (other than
|
|
.Fa yydebug
|
|
and
|
|
.Fa yynerrs )
|
|
are allocated on the stack within
|
|
.Fn yyparse ,
|
|
making the parser reasonably reentrant.
|
|
.It Dv %token-table
|
|
Make the parser's names for tokens available in the
|
|
.Dv yytname
|
|
array.
|
|
However,
|
|
.Nm
|
|
yacc
|
|
does not predefine
|
|
.Dq $end ,
|
|
.Dq $error
|
|
or
|
|
.Dq $undefined
|
|
in this array.
|
|
.El
|
|
.Sh PORTABILITY
|
|
According to Robert Corbett:
|
|
.Bd -offset indent
|
|
Berkeley Yacc is an LALR(1) parser generator.
|
|
Berkeley Yacc has been made as compatible as possible with AT\*[Am]T Yacc.
|
|
Berkeley Yacc can accept any input specification that conforms to the
|
|
AT\*[Am]T Yacc documentation.
|
|
Specifications that take advantage of undocumented features of AT\*[Am]T
|
|
Yacc will probably be rejected.
|
|
.Ed
|
|
.Pp
|
|
The rationale in
|
|
.%U http://pubs.opengroup.org/onlinepubs/9699919799/utilities/yacc.html
|
|
documents some features of AT\*[Am]T yacc which are no longer required for
|
|
POSIX compliance.
|
|
.Pp
|
|
That said, you may be interested in reusing grammar files with some
|
|
other implementation which is not strictly compatible with AT\*[Am]T yacc.
|
|
For instance, there is
|
|
.Xr bison 1 .
|
|
Here are a few differences:
|
|
.Nm
|
|
accepts an equals mark preceding the left curly brace
|
|
of an action (as in the original grammar file
|
|
.Dv ftp.y ) :
|
|
.Bd -literal -offset indent
|
|
| STAT CRLF
|
|
= {
|
|
statcmd();
|
|
}
|
|
.Ed
|
|
.Nm
|
|
and
|
|
.Xr bison 1
|
|
emit code in different order, and in particular
|
|
.Xr bison 1
|
|
makes forward reference to common functions such as
|
|
.Fn yylex ,
|
|
.Fn yyparse
|
|
and
|
|
.Fn yyerror
|
|
without providing prototypes.
|
|
.Pp
|
|
.Xr bison 1
|
|
support for
|
|
.Dq %expect
|
|
is broken in more than one release.
|
|
For best results using
|
|
.Xr bison 1 ,
|
|
delete that directive.
|
|
.Pp
|
|
.Xr bison 1
|
|
no equivalent for some of
|
|
.Nm 's
|
|
command-line options, relying on directives embedded in the grammar file.
|
|
.Pp
|
|
.Xr bison 1
|
|
.Fl y
|
|
option does not affect bison's lack of support for
|
|
features of AT\*[Am]T yacc which were deemed obsolescent.
|
|
.Pp
|
|
.Nm
|
|
accepts multiple parameters with
|
|
.Dq %lex-param
|
|
and
|
|
.Dq %parse-param
|
|
in two forms
|
|
.Bd -literal -offset indent
|
|
{type1 name1} {type2 name2} ...
|
|
{type1 name1, type2 name2 ...}
|
|
.Ed
|
|
.Pp
|
|
.Xr bison 1
|
|
accepts the latter (though undocumented), but depending on the
|
|
release may generate bad code.
|
|
.Pp
|
|
Like
|
|
.Xr bison 1 ,
|
|
.Nm
|
|
will add parameters specified via
|
|
.Dq %parse-param
|
|
to
|
|
.Fn yyparse ,
|
|
.Fn yyerror
|
|
and (if configured for back-tracking)
|
|
+to the destructor declared using
|
|
.Dq %destructor .
|
|
.Pp
|
|
.Xr bison 1
|
|
puts the additional parameters
|
|
.Dv first
|
|
for
|
|
.Fn yyparse
|
|
and
|
|
.Fn yyerror
|
|
but
|
|
.Dv last
|
|
for destructors.
|
|
.Nm
|
|
matches this behavior.
|
|
.El
|
|
.Sh ENVIRONMENT
|
|
The following environment variable is referenced by
|
|
.Nm :
|
|
.Bl -tag -width TMPDIR
|
|
.It Ev TMPDIR
|
|
If the environment variable
|
|
.Ev TMPDIR
|
|
is set, the string denoted by
|
|
.Ev TMPDIR
|
|
will be used as the name of the directory where the temporary
|
|
files are created.
|
|
.El
|
|
.Sh TABLES
|
|
The names of the tables generated by this version of
|
|
.Nm
|
|
are
|
|
.Dq yylhs ,
|
|
.Dq yylen ,
|
|
.Dq yydefred ,
|
|
.Dq yydgoto ,
|
|
.Dq yysindex ,
|
|
.Dq yyrindex ,
|
|
.Dq yygindex ,
|
|
.Dq yytable ,
|
|
and
|
|
.Dq yycheck .
|
|
Two additional tables,
|
|
.Dq yyname
|
|
and
|
|
.Dq yyrule ,
|
|
are created if
|
|
.Dv YYDEBUG
|
|
is defined and non-zero.
|
|
.Sh FILES
|
|
.Bl -tag -width /tmp/yacc.uXXXXXXXX -compact
|
|
.It Pa y.code.c
|
|
.It Pa y.tab.c
|
|
.It Pa y.tab.h
|
|
.It Pa y.output
|
|
.It Pa /tmp/yacc.aXXXXXX
|
|
.It Pa /tmp/yacc.tXXXXXX
|
|
.It Pa /tmp/yacc.uXXXXXX
|
|
.El
|
|
.Sh DIAGNOSTICS
|
|
If there are rules that are never reduced, the number of such rules is
|
|
written to the standard error.
|
|
If there are any
|
|
.Tn LALR(1)
|
|
conflicts, the number of conflicts is also written
|
|
to the standard error.
|
|
.\" .Sh SEE ALSO
|
|
.\" .Xr yyfix 1
|
|
.Sh STANDARDS
|
|
The
|
|
.Nm
|
|
utility conforms to
|
|
.St -p1003.2 .
|